1. INTRODUCTION


This is the final report describing CNVEO-sponsored research at the Purdue Robot Vision Lab.
This research program had three major goals: 1) evaluation of the currently used image
processing and pattern classification procedures for FLIR data; 2) development of algorithms
for LADAR imagery; and 3) development of techniques for Electronic Terrain Board Modeling
(ETBM). We believe that these three goals are all vital to the advancement of ATR science, in
the sense that their synergy will be reflected in most breakthroughs of the future.

As far as FLIR processing is concerned, our work centered on measuring the information
content of FLIR features from the standpoint of interclass separability, and on an evaluation of
image and algorithm metrics. We also strove to improve the different aspects of FLIR
processing. Since image segmentation is a critical step in this processing, we developed several
different approaches to the problem. Our motivation was simply to see which techniques might
be best suited for FLIR.

We believe that laser and millimeter radar range data, when available, would serve as an
important adjunct to FLIR information for target classification. Our hope is that a synergistic
integration of our current LADAR work with the developments made (and to be made) in FLIR
will lead to the most effective procedures for ATR. It is entirely possible that in the ATR
systems of the future, FLIR would play the role of focusing the attention of a processor on
potential targets; a laser-based system could then examine the geometrical attributes of spatial
data in the neighborhoods of these points to confirm or reject the initial guesses and also to
classify the detected targets.

During the past few years, many algorithms have been developed by us (and by many other
researchers in the country) for analyzing range maps using geometrical approaches. These
algorithms first attempt to extract from a 3-D scene the constituent surfaces of objects;
inferences about the objects are then generated by reasoning over these surfaces and their
relationships. Although it is entirely possible that the resolution available today would not
permit the application of such geometrical approaches to target identification, we nevertheless
think that such processing schemes should not be entirely ignored for LADAR imagery. There is
always the chance that even today such schemes could pay off at close ranges, and there is
always the possibility that future LADAR systems will have much higher resolution, making
geometrical approaches the method of choice.

To augment our development of new algorithms for LADAR, we also worked on Electronic
Terrain Board Modeling. The ETBM work is based on the following rationale: Although it is
desirable to use actual sensor data for the development and testing of ATR algorithms, such
data is not always readily available, because field tests are expensive and often postponed, not
to mention the physical impossibility of carrying out experimentation on all conceivable
configurations of targets, sensors, environmental conditions and terrain make-up. Anyone with
a validated ETBM system should be able to run “electronic” field tests and generate realistic
data without the time and cost of a conventional field test.

The  use  of  ETBM  for  modeling  also  provided  another  payoff:  the  development  of  a 
multi-resolution  data  structure  for  model-based  geometric  reasoning.  This  technique  provides  a 
natural  means  of  degrading  the  more  complex  geometric  models  of  targets  (which  we  hope  will 
be  applicable  in  future  high-resolution  LADAR  processing)  to  simpler  silhouette-like  models 
(which  are  applicable  to  the  present  resolution)  by  utilizing  the  relationship  between  discernible 
target  detail  and  the  range  of  the  target. 

In 1986, the first year of the effort, we largely fulfilled the requirements of the first main
goal by creating the software for testing the classifiability of FLIR features. The software was
based on Parzen estimation techniques. In 1987 and 1988, we concurrently carried out
investigations into LADAR processing and ETBM. In 1987, we showed how, by a combination of
solid modeling for targets, fractal representation for topography, and productions for terrain
features, we could construct synthetic images for the purpose of testing LADAR algorithms.
Another highlight of 1987 was our initial attempt at the development of LADAR algorithms for
recognizing objects on the basis of their geometrical attributes. Being preliminary in nature,
the algorithms assumed that the sensor was located at a particular vantage point with respect to
the target. Our current work makes such assumptions unnecessary, and the targets can be in any
pose in relation to the sensors. We also gained a better understanding of how the modeling work
in ETBM can aid in the evaluation of LADAR algorithms. The fact that we are using target and
terrain modeling for algorithm evaluation supports a point we have made all along: to score
future breakthroughs in ATR science, there must be concomitant developments in LADAR, ETBM
and FLIR.

We  now  describe  our  major  accomplishments  as  they  relate  to  our  three  main  goals. 


1.1  FLIR  PROCESSING 

The following are our achievements in working with FLIR imagery. Our goals in this area
were to evaluate the current state of the art in FLIR data processing, to improve FLIR
segmentation techniques, and to utilize high level reasoning to improve the classification of
FLIR data.

1.1.1  Evaluation  of  FLIR  Processing 

Information  Content  of  FLIR  Features  for  Interclass  Separation 

We set out to measure the information content of FLIR features from the standpoint of
interclass separability by testing the following conjecture:

In a typical single static FLIR frame there does not exist enough information for
target classification. [It is important to note that excluded from our consideration are
close range images or images taken under ideal environmental conditions.]

The software that we developed for this purpose includes modules for Parzen estimation
techniques that are used for obtaining lower and upper bounds on classification errors. We
believe that the upper bound thus obtained is tight and a good measure of the classification
information contained in a given set of features. For comparison, we also have a module that,
for a given set of features, computes the Bhattacharyya distance between two classes as a
measure of interclass separability. In order to conduct a full scale study on FLIR features, we
compiled a superset of the FLIR features used by NVL contractors.
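The two separability measures mentioned above can be illustrated with a minimal sketch. It assumes a single scalar feature, a Gaussian Parzen kernel of fixed width `h`, and the common practice of taking the resubstitution error as the optimistic (lower) estimate and the leave-one-out error as the pessimistic (upper) estimate; the function names and the choice of kernel are assumptions made for illustration, not a description of the software itself. The Bhattacharyya distance is shown for one-dimensional Gaussian classes only.

```python
import math

def gaussian_kernel(u, h):
    """1-D Gaussian Parzen kernel of width h evaluated at offset u."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2.0 * math.pi))

def parzen_density(x, samples, h, skip=None):
    """Parzen density estimate at x; `skip` omits one sample index."""
    total, count = 0.0, 0
    for i, s in enumerate(samples):
        if i == skip:
            continue
        total += gaussian_kernel(x - s, h)
        count += 1
    return total / count

def error_estimates(class_a, class_b, h, prior_a=0.5):
    """Return (resubstitution, leave-one-out) error rates for two classes."""
    prior_b = 1.0 - prior_a
    n = len(class_a) + len(class_b)
    resub_errors = loo_errors = 0
    for own, other, p_own, p_other in ((class_a, class_b, prior_a, prior_b),
                                       (class_b, class_a, prior_b, prior_a)):
        for i, x in enumerate(own):
            other_score = p_other * parzen_density(x, other, h)
            # Resubstitution: the test sample contributes to its own density.
            if p_own * parzen_density(x, own, h) <= other_score:
                resub_errors += 1
            # Leave-one-out: the test sample is withheld from its own density.
            if p_own * parzen_density(x, own, h, skip=i) <= other_score:
                loo_errors += 1
    return resub_errors / n, loo_errors / n

def bhattacharyya_1d(mean_a, var_a, mean_b, var_b):
    """Bhattacharyya distance between two 1-D Gaussian classes."""
    mean_term = (mean_a - mean_b) ** 2 / (4.0 * (var_a + var_b))
    var_term = 0.5 * math.log((var_a + var_b) / (2.0 * math.sqrt(var_a * var_b)))
    return mean_term + var_term

if __name__ == "__main__":
    a = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]   # made-up feature values, class A
    b = [0.7, 0.9, 1.1, 1.3, 1.5, 1.7]   # made-up feature values, class B
    print(error_estimates(a, b, h=0.2))
    print(bhattacharyya_1d(0.5, 0.1, 1.2, 0.1))
```

With a fixed kernel width the resubstitution estimate can never exceed the leave-one-out estimate, which is why the pair can be read as a bracket on the achievable error for the chosen features.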

All the software that was developed in the first trimester for measuring classifiability of
FLIR features was applied to NVL simulated terrain board data. Unfortunately, this data set
could not be used to either prove or disprove our non-classifiability conjecture because of
inadequate viewpoint spread in the acquired imagery. In the azimuthal plane, the viewpoints
were 45° apart (which is too large a spread for our study); moreover, there was no variation
with the elevation angle. Also, many of the images were taken from exactly the same viewpoint.

We also ran our classifiability software on the BRUT data. Since for some of the target
types we did not have enough images for a single range, we had to group together images taken
at different ranges to build a large enough sample size for statistically meaningful conclusions.
On hand-picked target images of good quality and using the bounded-rectangle method for
segmentation, our computed lower bound on classification error was 43% using gray scale
features. (As we have explained, it is not possible to use all the features at the same time,
because the resulting high dimensionality of the feature space would require a correspondingly
large sample size. So for any classification study, the dimensionality of the selected feature set
is limited by how much data is available for training the classifier.)

It is possible that we could reduce the 43% classification error estimate if we used a
superior segmentation strategy, such as those derived from wire frame models. The error
estimate would surely go up if we did not limit ourselves to good quality hand-picked images.
At the time, it was too early to tell whether these results proved or disproved our
non-classifiability conjecture. All these preliminary results are presented in Section 2.1.1.4.

We improved our software for testing the classifiability of FLIR features by incorporating
an advanced Parzen estimator for computing bounds on classification error. This advanced
estimator, based on Prof. Fukunaga’s recent work, makes decision thresholds a function of the
class covariances as well, as opposed to only the prior probabilities, which is the case with the
more traditional Parzen estimators. Other upgrades to this software consisted of adding more
features to the set we had reported on before, and the use of a superior segmentation
algorithm. With these changes, the classification error for gray scale features on the same
BRITT data changed from a lower bound of 2.2% to a lower bound of 12.7%. These results are
reported in Section 2.1.1.4.

Algorithm and Image Metrics

We set out to investigate algorithm and image metrics for ATR characterization. The
conclusion that we have arrived at with regard to image metrics is that the independence of
metrics is probably a necessary, but definitely not a sufficient, criterion for their selection. As
stated in Section 2.1.2, we believe that the dependence of a metric on different variables must
be discovered by theoretically analyzing algorithms for their performance, as opposed to by
heuristic specification. The point being made here is that if, say, our goal is target detection
and we wish to characterize the complexity of an image with regard to target detection, we
should theoretically analyze the algorithm used for the purpose and thus discover the form of
the metric. Such a metric would be both task oriented and algorithm dependent, which is how it
should be. There can be no absolute measures of image complexity even for specific processes
such as target detection. Since it is possible to use different algorithms for detection, the
measure of complexity must take into account the nature of the algorithm.

While in the first trimester we showed that TIR² was not a reliable metric for target
detection, in the second trimester we established that TBIR² also suffers from serious
deficiencies when it comes to measuring image complexity with regard to target segmentation.
These results are presented in Section 2.1.2.

As a first step in our attempts to come up with new metrics, we have in Section 2.1.2.2
proposed a method that might be able to measure the complexity of images from the standpoint
of segmentation using thresholding. Although it has fared better than the TBIR² metric in
assessing the difficulty of segmentation on the images we have tested it on, we are not yet ready
to give it the label of a new metric, as many questions that it has raised remain unanswered.

1.1.2  Two  FLIR  Segmenters 

EGT 


Section  2.2.1  presents  results  obtained  with  our  new  edge  guided  segmenter  for  FLIR  data. 
This  segmenter  is  much  simpler  than  the  Hughes  segmenter,  yet  its  performance  is  comparable, 
at  least  on  the  images  that  both  were  tested  on.  Our  segmentation  algorithm  is  based  on  the  fact 
that  in  the  traditional  histogram  based  procedures  the  hardest  problem  is  the  selection  of  a  good 
threshold.  Our  contention  is  that  in  the  vicinity  of  the  valley  where  a  good  threshold  might  be 
placed  to  separate  the  target  from  the  background,  the  shape  of  the  histogram  is  distorted  by  the 
boundary  pixels.  That  is  because  the  boundary  pixels  have  gray  levels  that  are  intermediate 
between  the  target  and  the  background.  Therefore,  in  our  new  segmentation  algorithm,  we 
delete  the  contributions  made  to  the  histogram  by  boundary  pixels. 
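The idea of removing boundary pixels from the histogram before choosing a threshold can be sketched as follows. This is only an illustration under stated assumptions: boundary pixels are identified by a simple central-difference gradient test against a hypothetical `edge_limit`, gray levels are integers in 0..255, and Otsu's criterion is used as a stand-in for whatever valley-seeking rule the actual segmenter applies.

```python
def gradient_magnitude(img, r, c):
    """Sum of absolute central differences, clamped at the image border."""
    rows, cols = len(img), len(img[0])
    gx = img[r][min(c + 1, cols - 1)] - img[r][max(c - 1, 0)]
    gy = img[min(r + 1, rows - 1)][c] - img[max(r - 1, 0)][c]
    return abs(gx) + abs(gy)

def interior_histogram(img, edge_limit, levels=256):
    """Histogram built only from pixels that are NOT on a strong edge."""
    hist = [0] * levels
    for r in range(len(img)):
        for c in range(len(img[0])):
            if gradient_magnitude(img, r, c) <= edge_limit:
                hist[img[r][c]] += 1
    return hist

def otsu_threshold(hist):
    """Threshold maximizing between-class variance (stand-in valley finder)."""
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t, h in enumerate(hist):
        w0 += h
        sum0 += t * h
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t

def edge_guided_segment(img, edge_limit):
    """Binary mask: 1 where the gray level exceeds the edge-guided threshold."""
    t = otsu_threshold(interior_histogram(img, edge_limit))
    return [[1 if v > t else 0 for v in row] for row in img]

if __name__ == "__main__":
    # Made-up scan: background 20, target 200, blurred boundary pixels at 110.
    row = [20, 20, 20, 110, 200, 200, 200, 110, 20, 20, 20]
    for line in edge_guided_segment([row, row, row], edge_limit=50):
        print(line)
```

In the toy image the intermediate-valued boundary pixels never enter the histogram, so the two remaining modes are cleanly separated and the threshold choice is no longer disturbed by them.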


Tree  Traversal 


Our long term aim is still to integrate with LADAR the FLIR algorithms we have
developed and reported on previously. However, such an integration will only come about after
we have stabilized the set of algorithms for LADAR processing. In the meantime, we have
continued to improve the different aspects of FLIR processing. Since image segmentation is a
critical step in this processing, we implemented a totally different approach to this problem:
segmentation by tree traversal. Our motivation was simply to compare three or four different
approaches to segmentation and to see what techniques might be best suited for FLIR. This
work is reported in Section 2.2.2.

1.1.3  The  Use  of  High  Level  Reasoning  to  Improve  the  Classification  of  FLIR  Data 

Because of the emphasis that we placed on the feature classifiability study, our progress on
the application of hierarchical vision techniques to FLIR images has been less than what was
originally anticipated. We have conceived of methods to carry out hierarchical reasoning, but
have not yet fully implemented any particular strategy.

Data  Structures  for  Symbolic  Reasoning 

We took an important first step toward the eventual development of more sophisticated
algorithms for ATR. This consisted of devising procedures for converting numerical pixel level
information in an image into a symbolic map. In Section 2.3.1, we have shown symbolic data
structures that will be used for associating pixels with the lowest level symbolic features, such
as lines, edges and blobs. It is important that efficient data structures be put into place; if
that is not done, simple questions like what edges a particular pixel might belong to can lead
to exhaustive and grossly inefficient searches in an image. An important side benefit of a good
data structure is that it can reduce some types of elementary symbolic reasoning to simple
operations such as a table look-up.
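As an illustration of how such a structure turns the question "which edges pass through this pixel?" into a table look-up, consider the following minimal sketch. The class name `SymbolicMap`, its methods, and the feature identifiers are hypothetical and chosen only for this example; the structures of Section 2.3.1 may be organized differently.

```python
from collections import defaultdict

class SymbolicMap:
    """Two-way index between pixels and low level symbolic features."""

    def __init__(self):
        self.features = {}                 # feature id -> (kind, list of pixels)
        self.by_pixel = defaultdict(set)   # (row, col) -> ids of features there

    def add_feature(self, feature_id, kind, pixels):
        """Register a line, edge or blob and index every pixel it covers."""
        self.features[feature_id] = (kind, list(pixels))
        for p in pixels:
            self.by_pixel[p].add(feature_id)

    def features_at(self, pixel, kind=None):
        """Look up the features through a pixel without scanning the image."""
        ids = self.by_pixel.get(pixel, set())
        return sorted(i for i in ids
                      if kind is None or self.features[i][0] == kind)

if __name__ == "__main__":
    smap = SymbolicMap()
    smap.add_feature("e1", "edge", [(0, 0), (0, 1)])
    smap.add_feature("e2", "edge", [(0, 1), (1, 1)])
    smap.add_feature("b1", "blob", [(0, 1)])
    print(smap.features_at((0, 1), kind="edge"))
```

The cost of the query depends only on the number of features at the pixel, not on the image size, which is the efficiency argument made above.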

Pixel  Level  Hierarchical  Data  Structures 

In  hierarchical  vision,  we  conducted  some  studies  on  the  loss  of  classifiable  information  as 
we  go  up  a  pyramid  representation  of  a  FLIR  image.  The  pyramid  representation  was  obtained 
by  simple  4x4  averaging.  To  our  surprise,  we  were  for  the  most  part  unable  to  see  any 
appreciable  change  in  classification  error  as  we  moved  up  the  pyramid.  To  us  this  means 
that  at  this  time  there  is  probably  a  mismatch  between  the  resolution  implied  by  the 
matrix  sizes  used  for  FLIR  images  and  the  intrinsic  FLIR  sensor  resolution.  This  material 
appears in Section 2.3.2.
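A pyramid of the kind described can be built with a few lines of code. The sketch below assumes that "4x4 averaging" means each pixel at one level is the mean of a non-overlapping 4x4 tile at the level beneath it; the function names and the `block` parameter are assumptions for illustration.

```python
def reduce_level(img, block=4):
    """Average non-overlapping block x block tiles; partial tiles are dropped."""
    rows, cols = len(img) // block, len(img[0]) // block
    out = []
    for r in range(rows):
        row = []
        for c in range(cols):
            tile = [img[r * block + i][c * block + j]
                    for i in range(block) for j in range(block)]
            row.append(sum(tile) / len(tile))
        out.append(row)
    return out

def build_pyramid(img, block=4):
    """Return [level 0 (full resolution), level 1, ...] until a level is too small."""
    levels = [img]
    while len(levels[-1]) >= block and len(levels[-1][0]) >= block:
        levels.append(reduce_level(levels[-1], block))
    return levels

if __name__ == "__main__":
    image = [[8] * 16 for _ in range(16)]      # made-up constant test image
    print([len(level) for level in build_pyramid(image)])
```

Classification features would then be recomputed at each level to observe how the error estimate changes with resolution.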

Global  Map  to  Improve  Edge  Labeling 

Data structures for symbolic reasoning have been proposed. Under development in the
Robot Vision Lab is a general purpose software tool for integrating map knowledge with
images. This system, called PSEIKI (a Production System Environment for Integrating
Knowledge with Images), is briefly described in Section 2.3.3. The symbolic reasoning
structures previously developed will be used in conjunction with this system for applying
spatial reasoning techniques to FLIR images.


1.2  Laser  Radar  Range  Data  Processing 

1.2.1 Data Descriptions


Our second major area of accomplishment was getting a handle on the various aspects of
the A. P. Hill LADAR data. We now understand the nature of the noise in the AM and the FM
parts of the data and, since the AM noise has much lower variance than the FM noise, we can
now separate from the high-noise composite data supplied to us a version whose noise
properties are substantially the same as those of the AM part alone.

In Section 3.3, we have reported on our preliminary processing of the A. P. Hill data. This
data, which is a composite of absolute range information through the FM channel and relative
range information obtained through the AM channel, has some peculiar noise characteristics
since the noise variances of the two channels are very different. While the variance of the FM
channel is around 9 meters - this happens to be close to half of the ambiguity interval of 18.75
meters - the variance of the AM channel is only about a meter. The composite data therefore
suffers from the worst of the two variances. Section 3.3 shows how, for the purpose of target
recognition, it might be possible to extract from the composite data a range map whose noise
properties are as good as those of the AM channel.* Another purpose of that section is to show
that in this trimester definite progress was made by us in gaining a full understanding of the
A. P. Hill data.

1.2.2  7-  'nation  of  LADAR  Images 


* Since Section 3.3 was first written, the AM-only data has been made available to us,
therefore lessening the need to extract the AM data from the composite data.




Classifiability  vs.  Range  Experiments 

Our preliminary work on the classification of LADAR imagery was extended to include
the effect of range. The rationale for the study was that a most important characteristic of any
LADAR algorithm is the nature of the degradation of its performance with increasing range.
Since only noiseless synthetic data was used in the preliminary work, the classification
accuracies we obtained were unrealistically high. Noisy synthetic data - which is a more
accurate representation of the real world case - was later processed through the same software
to obtain more meaningful results. The range dependence of classification accuracy is reported
in Section 3.2.2.

Optimal  Sampling  of  the  Feature  Space 

Since at large distances from a LADAR sensor it is unlikely that geometrical features, such
as relationships between different surfaces, would be discernible, target recognition would have
to depend upon silhouette information. A silhouette based recognition strategy is complicated
by the fact that silhouette features vary considerably over the range of all possible viewpoints.
To get around this difficulty, the usual practice is to represent the space of all viewpoints by a
small set of distinguished viewpoints such that each viewpoint in the small set corresponds to
a Gaussian distribution for the silhouette parameters of interest and that the Gaussian
distributions for all the viewpoints are as different as possible. This selection of distinguished
viewpoints has hitherto been done by a human operator on the basis of an intuitive
understanding of the dependence of silhouette features on viewpoints.

We made a first attempt at automatic selection of these distinguished viewpoints by first
computing the first order probability density associated with a silhouette feature of interest.
This density was computed using a large number of silhouettes uniformly sampled around the
object. We then sought a small number of silhouettes that would allow us to compute the same
density function with minimum error. An alternative to this method is to use clustering in the
silhouette space. For that purpose, we conducted a survey of the various automatic clustering
procedures that are available. This brief survey is included in Section 3.2.3.7. An immediate
practical benefit of this work is that it would allow us to greatly reduce the number of
training images needed to train a silhouette based classifier. This work is reported in Section
3.2.3.
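One way to picture the selection step is as a subset search: estimate the density of a silhouette feature from all sampled viewpoints, then look for a few viewpoints whose own density estimate stays close to it. The sketch below uses a greedy forward search, a Gaussian kernel density on a fixed grid, and a squared-error criterion; all three are assumptions made for illustration and are not claimed to be the procedure of Section 3.2.3.

```python
import math

def density_on_grid(values, grid, h):
    """Gaussian-kernel density of `values`, evaluated at each grid point."""
    norm = len(values) * h * math.sqrt(2.0 * math.pi)
    return [sum(math.exp(-0.5 * ((g - v) / h) ** 2) for v in values) / norm
            for g in grid]

def squared_error(p, q):
    """Sum of squared differences between two densities on the same grid."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def select_viewpoints(features, k, grid, h):
    """Greedily pick k viewpoint indices whose density best matches the full one.

    `features[i]` is the silhouette feature value seen from viewpoint i.
    Returns (chosen indices, error of the final subset density).
    """
    target = density_on_grid(features, grid, h)
    chosen, best_err = [], float("inf")
    for _ in range(k):
        best_i, best_err = None, float("inf")
        for i in range(len(features)):
            if i in chosen:
                continue
            trial = [features[j] for j in chosen + [i]]
            err = squared_error(density_on_grid(trial, grid, h), target)
            if err < best_err:
                best_i, best_err = i, err
        chosen.append(best_i)
    return chosen, best_err

if __name__ == "__main__":
    feats = [0.10, 0.12, 0.50, 0.52, 0.90, 0.88]   # made-up feature values
    grid = [i / 20 for i in range(21)]
    print(select_viewpoints(feats, 3, grid, h=0.1))
```

A greedy search is not guaranteed to find the minimum-error subset; it is used here only because it keeps the example short.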

Target  Detection  Experiments 

We developed an algorithm for target detection from single LADAR lines. This detection
algorithm, described in Section 3.2.4, is more sophisticated than, and subsumes, simple schemes
that base decisions on the presence of constant range lines between two range discontinuity
points. We say our algorithm subsumes simple approaches because, with appropriate settings in
the software, the algorithm can be made to base detection decisions on the mere presence of
constant range lines or, for that matter, even sloping range lines for targets whose flat surfaces
are at angles other than 90° with respect to the angle of look. Before the detection algorithm
can be used, it must be trained on sample data; the statistical similarity between the
backgrounds shown to the detector in the training phase and the background in the test phase
then becomes an important determinant of detector performance. In Section 3.2.4, we have also
shown that the detector performance can be improved by the enforcement of constraints like the
minimum number of detected pixels that must be contiguous for a target to be declared present.
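The contiguity constraint mentioned at the end of the paragraph can be sketched as a post-processing step on per-pixel detection flags along one LADAR line. The function names and the `min_run` parameter are hypothetical; how the per-pixel flags are produced by the trained detector is outside the scope of this sketch.

```python
def enforce_min_run(flags, min_run):
    """Keep only runs of consecutive detections that are at least min_run long."""
    out = [False] * len(flags)
    start = None
    # A trailing False acts as a sentinel that closes a run ending at the edge.
    for i, flag in enumerate(list(flags) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_run:
                for k in range(start, i):
                    out[k] = True
            start = None
    return out

def target_present(flags, min_run):
    """Declare a target only if some sufficiently long run survives."""
    return any(enforce_min_run(flags, min_run))

if __name__ == "__main__":
    line_flags = [0, 1, 0, 1, 1, 1, 0, 1, 1]   # made-up per-pixel detections
    print(enforce_min_run(line_flags, min_run=3))
    print(target_present(line_flags, min_run=3))
```

Isolated single-pixel detections, which are typical of noise spikes, are discarded, while the run of three adjacent detections is retained.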

We then generalized the single-line LADAR target detection algorithm to the case of
multi-line input. This work, discussed in Section 3.2.4, is aimed at examining the premise that
it should be possible to improve the detector performance by combining data from different
LADAR scan lines. We have examined two schemes for the multi-line case. In the first scheme,
if a decision is based on L adjacent lines and M pixels from each line, we simply treat the
detection problem as testing a binary hypothesis in an MxL dimensional space. In the second
scheme, the L lines are considered to constitute L independent detectors, the subdetector for
each line working in exactly the same manner as the single line detector described in Section
3.2.4. Using both these methods, our conclusion is that, from the standpoint of enhancing
target detection and minimizing false alarms, it is better, if possible, to have all the MxL
samples in a single line, as opposed to being distributed over L lines. We have demonstrated
that detectors utilizing a few LADAR lines may not be practical because of the excessively large
false alarm rates associated with them.

1.2.3  Low  Level  Processing  of  LADAR 

In  Section  3.4.1,  we  will  describe  in  detail  the  low  level  processing  that  is  required  before 
any  geometrical  reasoning  strategies  can  be  invoked. 

Edge  Detection  &  Surface  Labeling 

In the first trimester of 1988, we discovered that the traditional approach to LADAR
segmentation, which starts with the extraction of jump and curvature edges, is too sensitive to
the high variance noise and the large number of dropouts that characterize real LADAR data.
Therefore, in the second trimester of 1988 we focused on the development of a new
segmentation algorithm, which, we are happy to report, is indeed better. To compare the
performance of the new segmentation algorithm with the old, we had to develop criteria for
judging the quality of a segmentation.

Region Growing Approach

To carry out geometric reasoning over LADAR data, one must first extract the individual
surfaces of the visible part of the target. In our previous work, this was done with the help of
edge detection algorithms, the edges being mostly range jump discontinuities, roof-type edges,
and curvature maxima. We started out with edge detection for low level processing because
that is a common thing to do in industrial 3D robot vision and we felt that we should first try
the already proven approaches. Our experience with LADAR data has shown that edge
detection may not be the best approach for LADAR, especially when such data is characterized
by high variance noise and frequent dropouts. We therefore implemented a region growing
approach to identifying the target surfaces. We report our new LADAR segmentation procedure
in Section 3.3.2. It owes its superior performance in part to the fact that, prior to the
computation of surface normals, we fit 2-D B-splines to the range map. As a result, we obtain
bicubic approximations to object surfaces that are guaranteed to be continuous in first- and
second-order derivatives. This leads to a great deal of noise suppression and results in smooth
range maps and high quality segmentations. In Section 3.3.2.3, we show the results on synthetic
data. Results on the real A. P. Hill LADAR data are shown in Section 3.3.2.4. In Section
3.3.2.5, we then present objective measures for testing the quality of a segmentation, and we
report on an algorithm evaluation experiment in Section 3.3.2.6.
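The general shape of a region growing pass over a range map can be sketched as follows. This simplified illustration grows regions on the raw range values, joining 4-connected neighbors whose ranges differ by no more than a hypothetical tolerance `tol`, and it treats dropouts (marked `None`) as unlabeled. The procedure of Section 3.3.2 works on surface normals computed after B-spline smoothing, which this sketch does not attempt to reproduce.

```python
from collections import deque

def grow_regions(range_map, tol):
    """Label 4-connected regions of similar range; dropouts keep label 0.

    Returns (label image, number of regions found).
    """
    rows, cols = len(range_map), len(range_map[0])
    labels = [[0] * cols for _ in range(rows)]
    next_label = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if labels[r0][c0] or range_map[r0][c0] is None:
                continue
            next_label += 1                    # start a new region at this seed
            labels[r0][c0] = next_label
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nr, nc = r + dr, c + dc
                    if not (0 <= nr < rows and 0 <= nc < cols):
                        continue
                    if labels[nr][nc] or range_map[nr][nc] is None:
                        continue
                    # Join the neighbor if its range is close to the current pixel.
                    if abs(range_map[nr][nc] - range_map[r][c]) <= tol:
                        labels[nr][nc] = next_label
                        queue.append((nr, nc))
    return labels, next_label

if __name__ == "__main__":
    # Made-up range map in meters: a near surface, a far surface, one dropout.
    rmap = [[10.0, 10.1, 50.0],
            [10.2, None, 50.1],
            [10.1, 10.0, 50.2]]
    labels, count = grow_regions(rmap, tol=0.5)
    print(count, labels)
```

Because a region is built from local agreement between neighbors rather than from detected edges, a single dropout or noise spike leaves a hole in the region instead of breaking its boundary.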

Study  of  Five  LADAR  Segmenters 

In Section 3.3.3 we have presented a comparison of the following five different
segmentation algorithms for LADAR imagery:

1.  Planar-Patch  Fitting  Error 

2.  Variance-Based  using  3x3  and  5x5  Windows 

3.  Rockwell  Algorithm 

4.  Nettleton  method  using  3x3  and  5x5  Windows 

5.  Variance-Less-One  using  3x3  Window 

The  last  algorithm  represents  a  heuristic  fix  for  the  problems  caused  by  noise  spikes  in  the  other 
algorithms.  Our  comparison  illustrates  the  sensitivity  of  each  algorithm  to  the  thresholds 
selected  for  segmenting  out  the  object  from  the  background.  These  silhouette-based  algorithms 
are  applicable  to  the  present  low-resolution  imagery  and  images  with  distant  targets,  where 
geometric  approaches  are  not  appropriate  due  to  lack  of  sufficient  detail. 
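To make the role of the threshold and of noise spikes concrete, the following sketch computes a local variance statistic in a small window and thresholds it. Two points are assumptions made purely for illustration: that a low local range variance marks a pixel as belonging to a coherent object surface, and that "Variance-Less-One" means the variance is computed after discarding the single window sample farthest from the window median. The report's actual definitions appear in Section 3.3.3 and may differ.

```python
def window(img, r, c, half=1):
    """Values in the (2*half+1)-square window around (r, c), clipped at borders."""
    rows, cols = len(img), len(img[0])
    return [img[i][j]
            for i in range(max(r - half, 0), min(r + half + 1, rows))
            for j in range(max(c - half, 0), min(c + half + 1, cols))]

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def variance_less_one(values):
    """Variance after dropping the sample farthest from the median (assumed rule)."""
    ordered = sorted(values)
    median = ordered[len(ordered) // 2]
    worst = max(values, key=lambda v: abs(v - median))
    kept = list(values)
    kept.remove(worst)
    return variance(kept)

def segment(img, limit, stat=variance, half=1):
    """Label a pixel 1 when its local statistic falls below the threshold."""
    return [[1 if stat(window(img, r, c, half)) < limit else 0
             for c in range(len(img[0]))] for r in range(len(img))]

if __name__ == "__main__":
    rmap = [[10.0] * 5 for _ in range(5)]   # made-up flat surface
    rmap[2][2] = 100.0                      # one noise spike
    print(segment(rmap, limit=1.0, stat=variance)[2])
    print(segment(rmap, limit=1.0, stat=variance_less_one)[2])
```

In the toy example a single spike pushes the plain 3x3 variance above the threshold for every window that contains it, whereas the variant that drops one sample keeps the surface intact.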
Classification Results

Section 3.3.4 reports on some preliminary work on the classification of targets in LADAR
imagery. Since the silhouettes of the segmented outputs from LADAR data seem to be of good
quality in many cases, in our initial classification work we have chosen to use features identical
to those extracted from FLIR silhouettes. Although the classification results thus appear to be
promising - the accuracy achieved was over 90% on one set of data - we want to impress upon
the reader that ultimately LADAR classification must exploit the geometrical information
contained in range maps. That is, given objects in arbitrary orientations with respect to the
sensor, a classification strategy would work best if it is based on matching, in a relational
sense, the geometrical features extracted from a LADAR image with the geometrical features
extracted from object models, such matching being graph-theoretic in nature.

1.2.4  High  Level  Processing  of  LADAR 

One of our major accomplishments was the development of a set of algorithms for
recognizing targets in LADAR imagery on the basis of geometrical features. We are, of course,
aware of the fact that with the sensor resolution available at this time, geometrical features will
not be discernible for targets farther away than, say, a kilometer. However, as with most
technologies, we can expect the sensor resolution to improve over the next few years, especially
since there appear to be no fundamental reasons to preclude that. Therefore, the algorithms we
are developing, although applicable currently to close-range targets, are really aimed more at
the future. The initial set of algorithms discussed in this report is only meant to be an
educational exercise to help us formulate our ideas on how one should reason over geometrical
features; therefore, these algorithms only analyze range images from a single viewpoint.

Production  Systems  for  Target  Recognition 

In Section 3.4.1, we have reported on some novel reasoning strategies for drawing
inferences about a target using geometrical features derived from LADAR data. In one of these
novel strategies, the system performs default reasoning, which can best be explained in the
following manner: Let us say that from a given viewpoint a LADAR sensor is able to see N
surfaces. However, because of data acquisition and processing limitations, the target
recognition program is only able to discern M surfaces, where M < N. If the geometrical
characteristics of these M surfaces and their spatial inter-relationships are the same as those of
some M of the N surfaces expected to be seen on the target, we want our computer program to
declare the target present, albeit with a reduced probability. As we have shown, the
computation required for this kind of recognition strategy is vastly simplified if we assume
defaults for the missing object surfaces. We will show how these defaults are automatically
generated in two different computational paradigms, one based on Prolog and the other on OPS.

Multi-Resolution  Data  Structure  for  Model-Based  Geometric  Reasoning 

Availability  of  LADAR  data  has  opened  the  door  to  recognition  of  objects  by  geometric 
reasoning.  However,  one  must  first  contend  with  the  issue  of  a  target  being  in  any  of  an  infinity 
of  possible  poses  with  respect  to  the  sensor.  Researchers  have  advanced  the  notion  of  aspect 
graphs  to  deal  with  this  difficulty.  The  nodes  of  an  aspect  graph  represent  the  clustering  of  all 
viewpoints  into  a  small  number  on  the  basis  of  topological  equivalences.  The  idea  behind  the 
use  of  aspect  graphs  is  that  given  a  target  at  an  unknown  orientation  with  respect  to  the  sensor, 
we  should  first  determine  the  aspect  graph  node  to  which  the  target  data  corresponds;  we  should 
then  invoke  node-specific  strategies  for  a  more  precise  determination  of  the  orientation  with 
respect  to  the  sensor.  In  the  context  of  LADAR,  since  not  all  geometrical  features  are  equally 
visible  from  different  ranges,  we  must  use  a  hierarchy  of  aspect  graphs  instead  of  using  a  single 
aspect graph for a target. This has led to the notion of a multi-resolution aspect graph reported
in Section 3.4.2. By using the TWIN solid modeling system, we are now able to generate
multi-resolution aspect graphs for targets.

Another major accomplishment concerns the problem of how to incorporate the diminished sensor
resolution into a target model as the model is moved away from the sensor. This is a hard
problem, not generally amenable to analytical solution in its full three-dimensional form. We
have shown results from a scheme that deleted object surfaces on the basis of their visible
areas as the object was moved away. Later we recognized that our model degradation process also
needed to capture the following effect: much recognition is driven by edge information, the
ability of a sensor to discern an edge is a function of the dihedral angle at the edge, and the
measurement of a dihedral angle suffers as the object is moved away. An edge-deletion based
scheme for graceful degradation of a target model as the model is moved away from the sensor is
reported in Section 3.4.2, where we also show results on a couple of different targets. As a
target is moved farther away from a sensor, its edges are removed selectively on the basis of
edge length, dihedral angle, or both.
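
As an illustration only, the edge-deletion idea can be pictured with the short Python sketch
below. It is not the scheme of Section 3.4.2: the `Edge` record, the function name, the pixel
size, the minimum pixel count, and the linear growth of angular uncertainty with range are all
hypothetical choices made for the example. An edge is kept only if it spans enough pixels at the
given range and if its dihedral angle differs from a flat junction by more than the assumed
measurement uncertainty.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    name: str
    length_m: float      # physical edge length in metres
    dihedral_deg: float  # angle between the two faces meeting at the edge


def degrade_model(edges, range_m, pixel_rad=0.5e-3, min_pixels=3.0,
                  angle_noise_deg_per_km=10.0):
    """Return the edges a sensor at range_m could plausibly still resolve.

    All numeric defaults are assumed values for illustration, not sensor data.
    """
    kept = []
    for edge in edges:
        # Edge length expressed in pixels (small-angle approximation).
        pixels = edge.length_m / (range_m * pixel_rad)
        # How far the junction is from being flat (180 degrees).
        sharpness = abs(180.0 - edge.dihedral_deg)
        # Assumed: dihedral-angle uncertainty grows linearly with range.
        angle_floor = angle_noise_deg_per_km * range_m / 1000.0
        if pixels >= min_pixels and sharpness > angle_floor:
            kept.append(edge)
    return kept


MODEL = [
    Edge("hull", 6.0, 90.0),
    Edge("turret", 1.0, 150.0),
    Edge("bevel", 0.3, 170.0),
]
```

With these assumed numbers, all three edges survive at 100 m, only the long, sharp hull edge
survives at 1 km, and no edge survives at 5 km, which is the graceful degradation the text
describes.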


1.3  Electronic  Terrain  Board  Modeling 

One of our most significant accomplishments was the development of the software for Electronic
Terrain Board Modeling. We believe that ultimately any such software will have to have the
following features: 1) geometric models of targets, 2) models of background and foreground
clutter, 3) models of environmental conditions, and 4) validation routines for testing the
integrity of the simulated data. We have succeeded in developing the software for all four of
these items. We are able to convert wire-frame models into solid models using PADL
descriptions. The PADL solid descriptions are then used to generate synthetic range maps. We
model the background terrain in 3-D by using fractals. A production system is used for modeling
clutter such as trees. One advantage of our procedure for generating trees is that every tree
can be different, unlike in some other terrain modeling systems. Finally, to the resulting
synthetic 3-D imagery we add noise with the same drop-off statistics as the real LADAR data.
All this work is reported in Section 4.1.

Using the PADL-based system, we were able to create ground-truth images of the 1987 A.P. Hill
field test. This was done by using the target location information in the headers of the field
test images as input to our ETBM. The ETBM placed the target models as the real targets were
placed in the field test and then generated the corresponding image. These images proved to be
very useful in helping us understand the location and relative position of the targets in the
real images. Another advance was that we acquired the BRL solid modeler, which we attempted to
integrate with the original ETBM software. It appeared at the time that integrating the BRL
software with the rest of our system was a necessary prerequisite to showing more sophisticated
synthetic imagery: the PADL solid modeling system we had used was unable to handle the large
number of surfaces that we need for our ETBM modeling, and we had hoped this difficulty would
be alleviated by the BRL software.

In  Section  4.2,  we  show  that  with  the  TWIN  solid  modeler  we  were  able  to  generate  more 
complex  ETBM  imagery.  We  also  discuss  the  conversion  of  BRL  models  to  TWIN  and  talk 
about the difficulties we ran into in this regard; these difficulties owe their origins to the
tolerance problem in solid modeling.


2.  FLIR  PROCESSING 


In this chapter, we discuss our work on FLIR. Much of our work on FLIR has been motivated by
our concerns about the low information content of features extracted from FLIR data, especially
when targets and terrain are more than a kilometer from the sensor and when the atmospheric
conditions are less than ideal. Notwithstanding this concern, the fact remains that FLIR, being
passive, is an excellent sensor for monitoring battlefield activity. The overall goal for the
research community therefore is to determine how best to exploit this potential of FLIR with
the available discriminatory power of the current sensors. In the work reported in this
chapter, we focus on the classifiability aspects of FLIR features, on the segmentation
procedures that appear to work well without being excessively demanding on computational
resources, and on data structures that facilitate high-level reasoning in such images.

2.1.  CLASSIFIABILITY  OF  FLIR  FEATURES  AND  METRICS 

In 1987 we examined a large amount of FLIR data taken under different conditions. Since even
the best of the data did not inspire confidence in us, we decided to set up a conjecture about
the information content of FLIR features; we of course maintained an open mind about the truth
or falsity of the conjecture. As the following section demonstrates, the conjecture was tested
by using techniques based on Parzen estimation theory, which is capable of yielding, in a
non-parametric manner, lower and upper bounds on the classification error. Although, on account
of dimensionality issues, we were not able to test the conjecture in the full feature space, we
did examine the appropriate subspaces and arrived at the conclusion that there was not
sufficient information contained in the features used by most contractors at this time to
warrant hardware implementations of FLIR-based ATR.

2.1.1. Information Content of FLIR Features for Interclass Separation

As mentioned in the Introduction, our basic aim here is to test the following conjecture:

In  a  typical  single  static  FLIR  image,  there  does  not  exist  enough  information  for 
accurate  target  classification. 

The  conditions  on  the  conjecture  are  that  the  single  frame  should  be  typical  of  FLIR  imagery 
(likely  to  be  recorded  under  actual  conditions)  and  definitely  excluded  are  close-range  FLIR 
images  and  images  recorded  under  ideal  environmental  conditions. 

We  state  this  conjecture  because  typical  FLIR  imagery  is  characterized  by  low  resolution; 
another reason for our conjecture is the strong dependence of FLIR target signatures on
environmental conditions. These factors cause us to question the viability of statistical classification
methods  applied  to  features  extracted  from  single  frames. 

By processing single static frames, we mean that no context-based, time-sequential,
expectation-driven or knowledge-based processing is performed.

To test the conjecture we are compiling a superset of the features that are used by the NVL
contractors and have developed a software package that, for any given set of features, computes
upper and lower bounds on their interclass separability. This is done by computing the
probability of classification error through Parzen and advanced Parzen estimation techniques.
The upper and lower bounds on the probability of classification error for a given set of
features indicate directly how much classifiability information that set contains.

As  is  well  known,  a  common  technique  used  for  target  classification  of  FLIR  images  is  to 
segment  the  image  into  regions  and  extract  various  features  (such  as  mean  gray  value,  region 
height,  and  region  width)  for  each  region  and  based  on  these  features  statistically  classify  the 
region  as  one  of  the  targets.  By  using  this  procedure  on  known  data,  one  can  determine  the 
extent  of  classifiability  information  contained  in  a  set  of  features.  The  basic  steps  of  our 
classification  study  are: 

1.  Segment  FLIR  images  so  that  each  region  contains  a  single  target. 

2.  Extract  a  superset  of  the  features  that  have  been  revealed  to  us  by  NVL  contractors. 

3.  Use the Parzen and advanced Parzen estimates to compute the underlying density functions
for the different classes.

4.  The extent of overlap between the density functions corresponding to different classes is a
measure of interclass separability. Compute upper and lower bounds on interclass separability.

The next section discusses the automatic segmenter we used. Section 2.1.1.2 presents in detail
the features used in the experiments. A brief review of the Bayesian decision process and
Bayesian error estimation via the Parzen and advanced Parzen density estimates follows in
Section 2.1.1.3. Preliminary results on interclass separability based on the Eglin Turntable
Data, NVL terrain board data, and the BRITT data are given in Section 2.1.1.4. Finally, future
experiments are discussed in Section 2.1.1.5.

2.1.1.1. Segmentation

In order to obtain better target silhouettes for more accurate feature calculation and
therefore more meaningful classification results, we implemented the likelihood segmenter
described in the Bandwidth Reduction and Intelligent Target Tracking (BRITT) Phase One Final
Report by Hughes Aircraft Company’s Electro-Optical & Data Systems Group [Hughes84]. We present
here (see Figure 2.1.1) the target silhouettes produced by the segmenter for images of varying
quality, but due to the proprietary nature of the Hughes report we will not discuss the
segmentation algorithm itself.

The input images we used are from the BRITT Target Recognizer Classifier Training data set. The
best segmentation output was produced for images similar to those in Figure 2.1.1 (a), (b), and
(c).


(a) britt040 : type=APC, range=5km, aspect=45deg
small "hot" target


(d) britt029 : type=TRUCK, range=2.5km, aspect=180deg
large "cool" target


(e) britt003 : type=APC, range=5km, aspect=180deg
no visible target where there should be one


(f) britt5)5 : type=APC, range=5km, aspect=270deg
noisy target


Figure 2.1.1 Continued.


(g) britt347 : type=TANK, range=2.5km, aspect=45deg
noisy background

2.1.1.2. Feature Extraction

Feature  extraction  is  a  critical  step  in  ATR  because  the  selection  of  features  greatly 
influences  the  ability  to  classify.  We  are  currently  using  a  superset  of  features  used  by  other 
NVL  contractors. 

The  following  description  of  the  features  assumes  that  G  (x,y)  is  the  grey  level  value  of  the 
image  containing  the  target,  and  T  is  the  target  region  in  the  xy  plane. 

1.  Mean grey value of the target

$g = \frac{1}{|T|} \sum_{(x,y) \in T} G(x,y)$

2.  Standard deviation of the target grey value

$\sigma = \sqrt{\frac{1}{|T|} \sum_{(x,y) \in T} G^2(x,y) - g^2}$

3.  Target height

$h_T = \max(x_i) - \min(x_i)$ for all $x_i \in T$

4.  Target width

$w_T = \max(y_i) - \min(y_i)$ for all $y_i \in T$

5.  Minimum grey value of the target

$m = \min \{G(x,y)\}$ for all $(x,y) \in T$

6.  Maximum grey value of the target

$M = \max \{G(x,y)\}$ for all $(x,y) \in T$

7.  Area of the target

$A = |\{(x,y) : (x,y) \in T\}|$

8.  Second and third order moment invariants

9.  Maitra’s beta functions (six of them)

10.  Height to width ratio of the target

$h_T / w_T$

11.  Perimeter to width ratio of the target

$P_{pw} = \frac{\text{perimeter}}{w_T}$

12.  Rectangularity measure

$\eta_{02} \eta_{20} - \eta_{11}^2$

where $\eta_{02}$, $\eta_{20}$, $\eta_{11}$ are normalized central moments.

13.  Square of width to height

$(w_T^2 / h_T^2)$

14.  Normalized contrast

$(g - b) / \sigma$

where b is the background average.

15.  Range

16.  Depression angle

$\alpha = \sin^{-1}(\text{elevation} / \text{range})$

17.  Square of the perimeter over the area

$(\text{perimeter})^2 / A$

18.  Square of the height over the area

$h_T^2 / A$

19.  Height times range squared

$(h_T \cdot \text{Range})^2$

20.  Area times Range squared

$A \cdot \text{Range}^2$

21.  The sign of the normalized contrast

$\operatorname{sgn}\left((g - b)/\sigma\right)$, distinguishing the cases $(g - b)/\sigma > 0$ and $(g - b)/\sigma < 0$
We rederived all of the features used by Martin Marietta to check for correctness. Detailed
derivations of the features are presented in Appendix C. These derivations uncovered an
inconsistency in one of Hu’s invariants. This error was traced back to a typographical error in
Hu’s original paper [Hu62]. Further study showed that this error had been reported in [Ma79].
The feature in error was $\eta_{pq}$, which was given as:

$\eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\frac{p+q}{2}}}$

but should have been:

$\eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\frac{p+q}{2} + 1}}$

Unfortunately $\eta_{pq}$ appears in all of Hu’s invariant moments, so none of the invariant
moments computed by Martin Marietta could have been correct.
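
The effect of the exponent can be checked numerically. The Python sketch below is an
illustration only (the function names and the rectangular test shapes are assumptions, not
material from Appendix C). It computes $\eta_{20}$ for a binary rectangle and for the same
rectangle at twice the size: with the exponent $(p+q)/2 + 1$ the two values nearly agree, as a
scale-normalized moment should, whereas with the exponent $(p+q)/2$ they differ by roughly the
square of the scale factor.

```python
def central_moment(pixels, p, q):
    """Central moment mu_pq of a binary region given as (x, y) pixel pairs."""
    count = len(pixels)
    x_bar = sum(x for x, _ in pixels) / count
    y_bar = sum(y for _, y in pixels) / count
    return sum((x - x_bar) ** p * (y - y_bar) ** q for x, y in pixels)


def normalized_moment(pixels, p, q, corrected=True):
    """eta_pq = mu_pq / mu_00 ** gamma.

    corrected=True uses gamma = (p + q) / 2 + 1; corrected=False reproduces
    the misprinted exponent gamma = (p + q) / 2.
    """
    gamma = (p + q) / 2 + (1 if corrected else 0)
    return central_moment(pixels, p, q) / central_moment(pixels, 0, 0) ** gamma


def rectangle(height, width):
    """All pixel coordinates of a filled height-by-width rectangle."""
    return [(x, y) for x in range(height) for y in range(width)]


SMALL = rectangle(20, 10)
LARGE = rectangle(40, 20)  # same shape, scaled by a factor of two
```

The small residual difference in the corrected case comes from pixel discretization and shrinks
as the region grows.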


Maitra’s invariants (derived in Appendix C) are invariant under scale, rotation, translation,
and illumination. These invariants were used in [MM84]; however, when $\beta_5$ is negative,
$\beta_4$ is undefined. We have found that the following new definition for $\beta_4$ is more stable:


2.1.1.3. Classification of FLIR Imagery

The classification problem, as it applies to FLIR imagery, can be stated as follows: Once an
object is detected in an image, it is usually desired to determine what type of object it is.
To do this, a vector of features, X, is extracted from the object (e.g. mean grey value,
various moments, etc.). The vector can then be used to estimate the probability that the object
belongs to any given class. Assuming that the object can only be from one of two classes,
$\omega_1$ or $\omega_2$, the following decision rule is applied. If the probability that the
object is from class 1 is greater than the probability that the object is from class 2, the
object is assigned class 1 membership; otherwise, it is assigned to class 2. This can be stated
mathematically as

$X \rightarrow \begin{cases} \omega_1 & P(X \in \omega_1 \mid X) > P(X \in \omega_2 \mid X) \\ \omega_2 & P(X \in \omega_2 \mid X) > P(X \in \omega_1 \mid X) \end{cases}$

where $\omega_i$ stands for class i, and $X \rightarrow \omega_i$ indicates that X is classified
to class i. The decision rule (for the one-dimensional case) is shown graphically in Figure 2.1.2.

Bayes’ theorem shows a way to find the a-posteriori probability $P(X \in \omega_i \mid X)$ given
the a-priori conditional probability $P(X \mid X \in \omega_i)$. Bayes’ theorem can be expressed as

$P(X \in \omega_i \mid X) = \frac{P(X \mid X \in \omega_i) \cdot P(X \in \omega_i)}{P(X)}$


Figure 2.1.2 Classification Decision Rule

If we define the likelihood ratio as

$l(X) = \frac{P(X \mid X \in \omega_1)}{P(X \mid X \in \omega_2)}$

the above decision rule can be expressed as

$X \rightarrow \begin{cases} \omega_1 & -\ln(l(X)) < t \\ \omega_2 & -\ln(l(X)) > t \end{cases}$

where the decision threshold, t, is defined as

$t = \ln \frac{P(X \in \omega_1)}{P(X \in \omega_2)}$    (2.2.1)

It can be shown that the probability of error for the Bayesian decision rule is the area of the
shaded portion of the graph in Figure 2.1.2. The decision rule can be extended to m classes,
$\omega_1, \omega_2, \ldots, \omega_m$, which is expressed as:

$X \rightarrow \omega_i$ when $\ln(P(X \in \omega_i \mid X)) > \ln(P(X \in \omega_j \mid X)) \quad \forall\, j = 1, 2, \ldots, m,\ j \ne i$    (2.1.2)

if t = 0.
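
A minimal Python sketch of the two-class rule may help make the threshold concrete. It assumes
one-dimensional Gaussian class-conditional densities with known mean and standard deviation,
and it takes the threshold as the logarithm of the ratio of the class 1 prior to the class 2
prior; the function names and the numerical values are hypothetical and chosen only for the
example.

```python
import math


def log_gaussian(x, mean, std):
    """Log of a one-dimensional normal density."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2.0 * math.pi))


def classify(x, class1, class2, prior1=0.5):
    """Return 1 or 2 by comparing -ln l(x) with t = ln(P1 / P2).

    class1 and class2 are (mean, std) pairs; prior1 is P(X in class 1).
    """
    # -ln l(x) = ln p(x | class 2) - ln p(x | class 1)
    neg_log_ratio = log_gaussian(x, *class2) - log_gaussian(x, *class1)
    threshold = math.log(prior1 / (1.0 - prior1))
    return 1 if neg_log_ratio < threshold else 2


CLASS1 = (0.0, 1.0)  # assumed mean and standard deviation of class 1
CLASS2 = (4.0, 1.0)  # assumed mean and standard deviation of class 2
```

With equal priors the boundary sits midway between the two means, so x = 2.3 is assigned to
class 2; raising the class 1 prior to 0.9 moves the boundary toward class 2, and the same point
is then assigned to class 1.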


2.1.1.3.1. The Parzen Density Estimate

If the distribution of the features, X, is known or can be determined parametrically, the
problem of finding the classification error is easy. Unfortunately, in this case we must
estimate the density function. Given N samples, $X_1, X_2, \ldots, X_N$, from a density
function, the Parzen estimate of a density function can be defined as

$\hat{p}(X) = \frac{1}{N} \sum_{i=1}^{N} \frac{1}{h^n} k\left(\frac{X - X_i}{h}\right)$

where $k(\cdot)$ is called the kernel of the estimate and should be a nonnegative Borel
measurable function satisfying $\int k(X)\,dX = 1$. An example of a one-dimensional Parzen
estimate is shown in Figure 2.1.3. Each sample $X_i$ is shown with its corresponding kernel.
These kernels are summed to give the estimate $\hat{p}(X)$. It should be noted that the only
parameters needed for the Parzen density estimate are the form and size of the kernel function.
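
The one-dimensional case (n = 1) of the estimate above can be sketched in a few lines of
Python. The sketch is illustrative only: the function names and the sample values are
assumptions, and a unit-variance Gaussian is used as the kernel.

```python
import math


def gaussian_kernel(u):
    """Unit-variance normal density; it is nonnegative and integrates to 1."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def parzen_density(x, samples, h, kernel=gaussian_kernel):
    """One-dimensional Parzen estimate: (1/N) * sum of (1/h) * k((x - x_i) / h)."""
    return sum(kernel((x - xi) / h) for xi in samples) / (len(samples) * h)


SAMPLES = [0.0, 1.0, 1.5, 4.0]  # hypothetical feature values
KERNEL_SIZE = 0.5               # hypothetical kernel size h
```

Because each scaled kernel integrates to one and the sum is divided by N, the estimate itself
integrates to one. It is largest where the samples cluster (near 1.0 to 1.5 here) and small in
the gap between 1.5 and 4.0.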

The Parzen density can be used to estimate the probability of error in the following manner. If
$N_j$ is the number of samples of class j and N is the total number of samples,
$P(X \in \omega_j)$ can be approximated with $N_j/N$. If $X_{ij}$ is the ith sample from the
jth class, we can let

$p_j(X) = P(X \in \omega_j \mid X) = \frac{P(X \mid X \in \omega_j) \cdot P(X \in \omega_j)}{P(X)} = \frac{1}{P(X)} \cdot \frac{P(X \in \omega_j)}{N_j} \sum_{i=1}^{N_j} \frac{1}{h^n} k_j\left(\frac{X - X_{ij}}{h}\right)$

Removing terms that will appear in all classes yields $p_j(X)$, which is defined by

$p_j(X) = \sum_{i=1}^{N_j} k_j\left(\frac{X - X_{ij}}{h}\right) \qquad j = 1, 2, \ldots, m$    (2.1.3)

If we substitute Equation (2.1.3) into Equation (2.1.2) we can classify

$X \rightarrow \omega_j$ when $p_j(X) > p_i(X) \qquad i = 1, 2, \ldots, m,\ i \ne j$

If all samples, $X_{ij}$, are classified in this manner and we let $N_{error}$ be the count of
misclassified samples, the probability of error can be approximated by

$\frac{N_{error}}{N}$

It should be noted that the contribution of the sample itself is taken into account when the
probability that it belongs to its true class is being computed; this corresponds to the case
where the classifier is designed and tested using the same data set. This produces the
so-called resubstitution error [Fu72]. It can be shown that this gives a lower bound on the
true probability of error.

To get an upper bound on the probability of error, one can use the leaving-one-out method.
Basically, in this method we ignore the effect a sample has on the density estimate of its true
class, and then the probability of error is computed in the same manner as before. When we
leave out a sample, the equation for estimating the sample’s class density becomes

$p_j(X) = \sum_{i=1}^{N_j} k_j\left(\frac{X - X_{ij}}{h}\right) - k_j(0)$, where $X \in \omega_j$

2.1.1.3.2. Estimation of Classification Error in FLIR Data


In our classification experiments, our long-term aim is to find the error when classifying
among four objects: a tank, a truck, a jeep, and clutter. The Parzen estimation procedure will
be used to find the classification error of a Bayesian scheme using commonly used features from
FLIR images (real and simulated). Each class of objects will be split into two clusters
(subclasses): front/rear view and side view. This will be done in an attempt to ensure that the
classes have Gaussian distributions. If it is determined that the resulting clusters are still
not Gaussian, it may be necessary to create three clusters per class: front, rear and side views.

Unfortunately, to produce statistically meaningful results, the number of samples per cluster
must be at least an order of magnitude greater than the dimensionality of the data [KaLa83].
This is needed to accurately calculate the sample covariance matrices. Because we are going to
have two or three clusters per class, if we use approximately ten features we need to have
200-300 samples of each object to accurately determine the Bayesian probability of error.
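
The sample-count requirement is simple arithmetic, sketched below in Python for convenience.
The function name and the default factor of ten samples per feature dimension (one reading of
"an order of magnitude") are assumptions made for the example.

```python
def samples_needed(num_features, clusters_per_class, samples_per_dimension=10):
    """Samples required per object class under the order-of-magnitude rule.

    Each cluster needs about samples_per_dimension times as many samples as
    there are features, and every class contributes clusters_per_class clusters.
    """
    return samples_per_dimension * num_features * clusters_per_class
```

For ten features this gives 200 samples per object with two clusters per class and 300 with
three, which is the 200-300 range quoted above.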


As was mentioned in the previous section, the only parameters needed to use the Parzen estimate
are the kernel size and shape. We will use a Gaussian kernel of variable size. The Gaussian
kernel is used frequently in Parzen density estimation, and can be shown to be optimal for
classification error estimation if the classes have Gaussian densities. The Gaussian kernel can
be expressed as

$k_j(X - Y) = \frac{1}{(2\pi)^{n/2} \sqrt{|\Sigma_j|}} \exp\left[ -\frac{(X - Y)^T \Sigma_j^{-1} (X - Y)}{2h^2} \right]$

where n is the dimensionality (i.e. the number of features), h is the kernel size, $\Sigma_j$ is
the covariance matrix of class j, and $(X - Y)^T$ is X - Y transposed. Removing terms that will
appear in all the kernel functions yields the final kernel function

$k_j(X - Y) = \exp\left[ -\frac{(X - Y)^T \Sigma_j^{-1} (X - Y)}{2h^2} \right]$

The kernel size parameter, h, will be varied to give the lowest leaving-one-out error; this
gives the optimal kernel size.
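
The resubstitution and leaving-one-out procedures, together with the search over the kernel
size h, can be sketched in Python as follows. This is an illustrative sketch, not the software
package developed in this work. To stay within the standard library it replaces each class
covariance matrix by a diagonal matrix of per-feature sample variances, which is a simplifying
assumption; the function names and the synthetic two-class data generator are also hypothetical.

```python
import math
import random


def diag_variances(samples):
    """Per-feature sample variances (assumed diagonal stand-in for the covariance)."""
    count = len(samples)
    variances = []
    for d in range(len(samples[0])):
        mean = sum(s[d] for s in samples) / count
        var = sum((s[d] - mean) ** 2 for s in samples) / (count - 1)
        variances.append(max(var, 1e-12))
    return variances


def kernel(x, y, variances, h):
    """Gaussian kernel exp(-(x-y)^T S^-1 (x-y) / (2 h^2)) with S = diag(variances)."""
    q = sum((a - b) ** 2 / v for a, b, v in zip(x, y, variances))
    return math.exp(-q / (2.0 * h * h))


def argmax(scores):
    return max(range(len(scores)), key=scores.__getitem__)


def parzen_error_bounds(classes, h):
    """Return (resubstitution error, leaving-one-out error) for kernel size h."""
    variances = [diag_variances(c) for c in classes]
    total = sum(len(c) for c in classes)
    resub_wrong = loo_wrong = 0
    for true_class, members in enumerate(classes):
        for x in members:
            # Unnormalized class scores: the sum of kernels over each class.
            scores = [sum(kernel(x, y, variances[j], h) for y in c)
                      for j, c in enumerate(classes)]
            resub_wrong += argmax(scores) != true_class
            # Leave the sample out of its own class: subtract k_j(0) = 1.
            scores[true_class] -= 1.0
            loo_wrong += argmax(scores) != true_class
    return resub_wrong / total, loo_wrong / total


def best_kernel_size(classes, candidates):
    """Kernel size giving the lowest leaving-one-out error among the candidates."""
    return min(candidates, key=lambda h: parzen_error_bounds(classes, h)[1])


def make_class(rng, mean, count):
    """Synthetic unit-variance Gaussian samples, used here only as test data."""
    return [[rng.gauss(m, 1.0) for m in mean] for _ in range(count)]
```

Subtracting the sample's own kernel can only lower the score of its true class, so a sample
misclassified under resubstitution is also misclassified under leaving-one-out, and the second
error count is never smaller than the first.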


2.1.1.3.3. Advanced Parzen Error Estimation Techniques

The Parzen error estimation technique described up to this point has generally been considered
state-of-the-art. However, recent work at Purdue has focused on improving the results produced
by the Parzen error estimation procedure [FuHu87]. The majority of this work has been centered
on changing the decision threshold from the value defined in Equation 2.2.1. The incentive
behind this work is the realization that, under certain conditions, the expected value of the
estimated density with respect to X is

$E\{\hat{p}_i(X)\} = p_i(X) * \frac{1}{h^n} k\left(\frac{X}{h}\right)$    (2.1.4)

where * represents convolution in $R^n$. This equation is valid if the covariance of the kernel
function is set equal to the covariance of the data samples, $\Sigma_i$. If this is done, the
covariance of the scaled kernel, $(1/h^n) k(X/h)$, is given by $h^2 \Sigma_i$. As can be seen
from Equation 2.1.4, the estimated density is a smoothed version of the true density function,
and as h becomes small, the estimated density approaches the true density. However, although
the bias of the estimate decreases for small h, the variance increases rapidly. Therefore, the
choice of the sample scaling factor is critical.

The bias of the estimate can be solved for explicitly if the true density is assumed to be
Gaussian. When $p_i(X)$ and $(1/h^n) k_i(X/h)$ are normal densities with covariances
$\Sigma_i$ and $h^2 \Sigma_i$, the convolution produces another normal density with covariance
$(1+h^2)\Sigma_i$. As h increases, the variance of the estimate decreases. Thus, a new estimate
can be formed

$\ln(p(X)) \approx (1+h^2) \ln(\hat{p}(X)) + \frac{h^2}{2} \ln(|\Sigma|)$

Using this new estimate, the decision threshold in Equation 2.2.1 can be changed to

$t = \frac{1}{1+h^2} \ln \frac{P(X \in \omega_1)}{P(X \in \omega_2)} + \frac{h^2}{2(1+h^2)} \ln \frac{|\Sigma_1|}{|\Sigma_2|}$

It should be noticed that if all classes have the same number of samples and the determinants
of the covariances of all classes are equal, then this expression is equal to the one shown in
Equation 2.2.1.
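
As a small numerical illustration, the adjusted threshold can be evaluated with the Python
sketch below. It assumes the basic threshold is the logarithm of the ratio of the class 1 prior
to the class 2 prior and that the covariance determinants are supplied directly; the function
names are hypothetical.

```python
import math


def basic_threshold(prior1, prior2):
    """Threshold of the unmodified rule: t = ln(P1 / P2)."""
    return math.log(prior1 / prior2)


def adjusted_threshold(prior1, prior2, det1, det2, h):
    """Threshold after compensating for kernel smoothing of Gaussian classes.

    det1 and det2 are the determinants of the class covariance matrices and
    h is the kernel size.
    """
    scale = 1.0 / (1.0 + h * h)
    return (scale * math.log(prior1 / prior2)
            + 0.5 * h * h * scale * math.log(det1 / det2))
```

With equal priors and equal determinants both thresholds are zero, and with h = 0 the adjusted
threshold reduces to the basic one. For eight-dimensional classes with covariances I and 4I and
equal priors, h = 1 gives a threshold of about -2.77, so the correction can be substantial when
the class covariances differ.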

2.1.1.4. Preliminary Classification Results

This section details the results of all the classification experiments performed to date. One
experiment was reported in the first report, and three more classification experiments were
reported in the second. In the time since the second report, one further experiment has been
performed. All experiments used the Parzen classification error estimation procedure described
in Section 2.1.1.3. The sources of data used in these experiments were:

1)  Gaussian data with known interclass overlap.

2)  Eglin turntable data.

3)  Simulated FLIR from the NVL terrain board.

4)  A subset of the targets from the BRITT database.

The following explains each of the experiments.

2.1.1.4.1. Experiment 1 - Gaussian Data

The experiment on the first set of data was conducted to test the performance and demonstrate
the statistical validity of the Parzen error estimation technique. In this experiment, the
procedure was performed on Gaussian data with known Bayesian error. This test of the Parzen
technique was performed for two reasons. First, it was desired to see qualitatively how tight
the upper and lower bounds produced by the original and advanced Parzen error estimates were to
the known error. Second, it was also desired to compare the results with the upper bound given
by the Bhattacharyya [Fu72] distance and a parametric classifier for Gaussian data. The
Bhattacharyya distance is a special case of the Chernoff bound [Ch62] and is commonly used to
estimate the upper bound of the error probability. It was desired to compare the two methods to
see if the Parzen estimate would yield a tighter upper bound than the one provided by the
Bhattacharyya distance. The parametric technique for classifying Gaussian data is known as a
quadratic classifier; this classifier should give the best results possible for Gaussian data.
Thus, the error bounds produced by the quadratic classifier are a good standard by which to
measure the performance of the non-parametric techniques.
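
For reference, the Bhattacharyya bound for two Gaussian classes can be computed as in the
Python sketch below. The sketch restricts itself to diagonal covariance matrices, and the mean
separation of 2.56 along one of eight features is an illustrative assumption chosen so that the
exact Bayes error for identity covariances comes out near 10%; it is not taken from the tables
of this report.

```python
import math


def bhattacharyya_distance(mean1, var1, mean2, var2):
    """Bhattacharyya distance between two Gaussians with diagonal covariances.

    mean1, mean2 are lists of feature means; var1, var2 are lists of variances.
    """
    mean_term = 0.0
    log_term = 0.0
    for m1, v1, m2, v2 in zip(mean1, var1, mean2, var2):
        avg = 0.5 * (v1 + v2)
        mean_term += (m2 - m1) ** 2 / avg
        log_term += math.log(avg / math.sqrt(v1 * v2))
    return mean_term / 8.0 + 0.5 * log_term


def bhattacharyya_bound(distance, prior1=0.5):
    """Upper bound on the Bayes error: sqrt(P1 * P2) * exp(-distance)."""
    return math.sqrt(prior1 * (1.0 - prior1)) * math.exp(-distance)


def bayes_error_identity(separation):
    """Exact Bayes error for equal priors, identity covariances, given mean separation."""
    return 0.5 * math.erfc(separation / (2.0 * math.sqrt(2.0)))


DIM = 8
MEAN1 = [0.0] * DIM
MEAN2 = [2.56] + [0.0] * (DIM - 1)  # assumed separation, for illustration only
ONES = [1.0] * DIM
BOUND = bhattacharyya_bound(bhattacharyya_distance(MEAN1, ONES, MEAN2, ONES))
EXACT = bayes_error_identity(2.56)
```

Under these assumptions the bound is about 22% against an exact error of about 10%, which
illustrates how loose the Bhattacharyya bound can be for data of this kind.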

Two separate sets of data were used in this experiment. The first data set tested performance
when the means of the distributions of the classes differed. The second data set tested
performance when the distributions of the data from the two classes had different covariance
matrices. See Tables 2.1.1 and 2.1.2 for the parameters of the data for the two experiments.

2.1.1.4.1. Results

Ten trials were run on each data set, with results being reported for each trial. The ten
trials were also averaged and the means and variances have also been reported. The experimental
results of both the individual and averaged trials for the two sets of data are shown in Tables
2.1.1 through 2.1.4. As can be seen from the tables, the upper bound sometimes falls under the
true error and the lower bound is sometimes slightly larger than the true error. This is due to
the statistical nature of the data and occurs in all types of error estimation schemes. On the
average, the bounds provided by the Parzen techniques prove to be very tight. This can be seen
when the results of the Parzen and Bhattacharyya upper bounds are compared; in most cases, the
Bhattacharyya bound reports almost twice the amount of actual error, while the output of the
Parzen techniques closely matches the quadratic output.

2.1. 1.4.2.  Experiment  2  -  Eglin  Turntable  Data 

For  the  purpose  of  the  first  report,  preliminary  results  were  obtained  simply  to  demonstrate 
that  the  software  for  computing  interclass  separability  is  in  place.  These  preliminary  results 
were  obtained  by  using  the  Eglin  turntable  data.  The  data  consists  of  40  FLIR  images  of  one 
target  (a  .ank)  rotating  on  a  turntable.  The  images  were  segmented  by  hand  (to  reduce  the 
chance  of  segmentation  error)  and  the  features  discussed  in  [MM84]  were  extracted  for  each  of 


Table  2.1.1  Error  Bounds  of  10%  gaussian  data  using  Parzen  error  estimates. 

True  error:  10  % 

Dimension:  8 
Samples  per  Class:  100 
Covariances:  Ej  =  I,  E2  =  I 
means: 


Mj  = 


Table  2.1.2  Error  Bounds  of  9%  gaussian  data  using  Parzen  error  estimates. 

True  error:  9  % 

Dimension:  8 
Samples  per  Class:  100 
Covariances:  Ej  =1,  E2  =  41 
means: 


Mj  = 


Table 2.1.3 Error bounds of 10% Gaussian data using quadratic and Bhattacharyya estimates.

True error: 10%

                  Quadratic          Bhattacharyya
Trial          Upper     Lower         Distance
0                8.5       7.5           18.3
1                9.5       7.5           19.9
2               12.0       9.5           20.4
3               13.5      11.0           18.0
4                8.5       8.0           19.7
5               10.0       8.5           19.8
6                9.5       8.5           18.9
7                8.5       7.5           19.2
8               13.0      12.0           20.9
9               10.5       9.0           20.2
avg - mean      10.3       8.9           19.5
avg - s.d.       1.9       1.5            0.9

Table 2.1.4 Error bounds of 9% Gaussian data using quadratic and Bhattacharyya estimates.

True error: 9%

                  Quadratic          Bhattacharyya
Trial          Upper     Lower         Distance
0                3.5       2.0           14.5
1               15.5      12.0           20.5
2                8.0       5.0           19.7
3                8.0       7.0           17.1
4               11.0       8.5           18.9
5               10.0       7.0           17.6
6                8.5       8.0           19.4
7               10.0       7.0           19.4
8                8.0       5.5           17.7
9                8.0       7.0           17.9
avg - mean       9.1   [illegible]       18.3
avg - s.d.       3.0   [illegible]        1.7
the images. These features were examined and two features were selected for use in each of
the classification experiments. The image database contains only 40 images of one target, so
the classification experiment was designed to distinguish the front and back views from the side
view of the tank. Only two features were used in each experiment because there were fewer than
20 images per class and at least 10 samples per feature are needed for each class [KaLa83].

Figure 2.1.4 shows three sample views from the Eglin turntable data and the corresponding
binary segmented images. Table 2.1.5 shows the values of the features extracted from the
images in Figure 2.1.4. The moments presented in Table 2.1.5 were computed with the correct
invariants, as discussed in Section 2.1.1.2.
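As an illustration of how the moment rows of Table 2.1.5 relate to one another, the following sketch computes raw, central, and normalized moments of a small grey-scale image, together with the first two of Hu's invariants. It is not the code used to produce the table. The test image is hypothetical, and the normalization (central moment divided by u00 raised to the power (p+q)/2 + 1) is an assumption that appears consistent with the tabulated values.

```python
def raw_moment(img, p, q):
    """m_pq = sum of x^p * y^q * f(x, y); img[y][x] holds the grey levels."""
    return sum((x ** p) * (y ** q) * img[y][x]
               for y in range(len(img)) for x in range(len(img[0])))


def central_moments(img, max_order=3):
    """Moments about the grey-level centroid, up to the given order."""
    m00 = raw_moment(img, 0, 0)
    xc = raw_moment(img, 1, 0) / m00
    yc = raw_moment(img, 0, 1) / m00
    mu = {}
    for p in range(max_order + 1):
        for q in range(max_order + 1 - p):
            mu[(p, q)] = sum(((x - xc) ** p) * ((y - yc) ** q) * img[y][x]
                             for y in range(len(img)) for x in range(len(img[0])))
    return mu


def normalized_moments(mu):
    """n_pq = u_pq / u00^((p + q) / 2 + 1) for orders two and higher."""
    return {(p, q): value / mu[(0, 0)] ** ((p + q) / 2 + 1)
            for (p, q), value in mu.items() if p + q >= 2}


def hu_first_two(eta):
    """First two of Hu's invariants from the normalized moments."""
    phi1 = eta[(2, 0)] + eta[(0, 2)]
    phi2 = (eta[(2, 0)] - eta[(0, 2)]) ** 2 + 4 * eta[(1, 1)] ** 2
    return phi1, phi2


def place(blob, rows, cols, top, left):
    """Copy a small blob into an otherwise empty image (illustrative helper)."""
    canvas = [[0] * cols for _ in range(rows)]
    for r, line in enumerate(blob):
        for c, value in enumerate(line):
            canvas[top + r][left + c] = value
    return canvas


blob = [[0, 2, 3, 0],
        [1, 4, 5, 2],
        [0, 3, 1, 0]]
image_a = place(blob, 12, 12, 1, 2)
image_b = place(blob, 12, 12, 6, 5)   # same target, translated

phi_a = hu_first_two(normalized_moments(central_moments(image_a)))
phi_b = hu_first_two(normalized_moments(central_moments(image_b)))
print(phi_a, phi_b)
```

The two printed pairs should agree to within rounding, since central moments do not change when the target is translated.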

The results of six classification experiments are presented in Table 2.1.6. The
classification errors varied from 0 to 25% for the lower bound and from 0 to 37.5% for the upper
bound. Due to the very limited size of the input data, the only real conclusion that can be
drawn from this classification data is that we are able to run classification experiments.
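The pairing of a lower and an upper error bound can be sketched as follows. This is a minimal illustration, assuming a Gaussian Parzen kernel of width h and equal priors, with the resubstitution error taken as the lower bound and the leave-one-out error as the upper bound. The two-dimensional data are synthetic and hypothetical, and the function names are not those of the software used in this study.

```python
import math
import random


def kernel_sum(x, samples, h, skip=None):
    """Sum of Gaussian kernels of width h centred on `samples`, evaluated at x."""
    total = 0.0
    for j, s in enumerate(samples):
        if j == skip:  # leave-one-out: ignore the test sample itself
            continue
        d2 = sum((a - b) ** 2 for a, b in zip(x, s))
        total += math.exp(-d2 / (2.0 * h * h))
    return total


def parzen_error(classes, h, leave_one_out):
    """Fraction of samples misclassified by a Parzen classifier (equal priors).

    The kernel normalising constant is the same for every class, so it is omitted.
    """
    errors = total = 0
    for c, samples in enumerate(classes):
        for i, x in enumerate(samples):
            scores = []
            for k, other in enumerate(classes):
                skip = i if (leave_one_out and k == c) else None
                n = len(other) - (0 if skip is None else 1)
                scores.append(kernel_sum(x, other, h, skip) / n)
            predicted = max(range(len(classes)), key=scores.__getitem__)
            errors += int(predicted != c)
            total += 1
    return errors / total


rng = random.Random(0)  # fixed seed so the sketch is repeatable
class_a = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(30)]
class_b = [(rng.gauss(2.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(30)]

lower = parzen_error([class_a, class_b], h=0.5, leave_one_out=False)
upper = parzen_error([class_a, class_b], h=0.5, leave_one_out=True)
print(f"resubstitution (lower) = {lower:.3f}, leave-one-out (upper) = {upper:.3f}")
```

With classes of equal size, including a sample in its own density estimate can only favour its true class, so the resubstitution error never exceeds the leave-one-out error in this sketch.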

2.1.1.4.3. Experiment 3 - Simulated FLIR

The data set for the third experiment was generated from the simulated FLIR obtained
from the Night Vision Lab's terrain board. The terrain board data included three types of
targets: APCs, tanks and trucks. In the database, each type of target was rotated through 360
degrees, in 45 degree increments. There were eleven images of each target for each target
orientation; however, the target images for any orientation were nearly identical. Each image in
the data set included three targets, one of each type. Because of the relatively large
size and high definition of the vehicles in this database, target detection and identification were
extremely easy.

2.1.1.4.3.  Target  Classes 

Two experiments were performed on the terrain board data. The first experiment used the
three classes of targets mentioned above. The second experiment was performed to determine if
the classifiability of the targets would improve if subclasses (clusters) were formed based on
target orientation. In this experiment, each class of targets was split into two clusters, one for the
front/rear view and another for the side views.

2.1.1.4.3. Features

In this experiment, three different feature sets were used; they fall into the following
categories:


1)  segmentation  features 

2)  grey  scale  features 


Figure 2.1.4 Sample images from Eglin turntable data. (a) Front view of tank (FILE01). (b)
Segmented front view. (c) Side view of tank (FILE15). (d) Segmented side
view. (e) Back view of tank (FILE27). (f) Segmented back view.


Table 2.1.5. Features extracted from images in Figure 2.1.4.

feature              front view      side view       back view
                     FILE01          FILE15          FILE27

height               45              43              44
width                47              98              49
area                 1511            2996            1800
height/width         0.957447        0.438776        0.897959
max                  140             215             210
min                  59              67              62
mean                 102.860359      114.354805      124.765556
variance             250.677017      845.803650      1963.733765
sigma                15.832783       29.082705       44.314037

moments
m00                  1.5542e+05      3.4261e+05      2.2458e+05
m10                  9.8136e+06      2.3239e+07      1.5123e+07
m01                  1.0338e+07      2.4963e+07      1.5022e+07
m20                  6.3890e+08      1.6087e+09      1.0442e+09
m11                  6.5201e+08      1.6958e+09      1.0108e+09
m02                  7.0811e+08      2.0093e+09      1.0347e+09
m30                  4.2751e+10      1.1344e+11      7.3732e+10
m21                  4.2411e+10      1.1747e+11      6.9740e+10
m12                  4.4663e+10      1.3701e+11      6.9576e+10
m03                  4.9827e+10      1.7383e+11      7.3235e+10

central moments
u00                  1.5542e+05      3.4261e+05      2.2458e+05
u20                  1.9255e+07      3.2411e+07      2.5790e+07
u11                  -7.4096e+05     2.6444e+06      -8.5466e+05
u02                  2.0482e+07      1.9047e+08      2.9887e+07
u30                  -2.1256e+07     -7.3833e+07     -6.1042e+07
u21                  8.0478e+06      -9.4706e+07     5.7593e+06
u12                  5.0408e+07      3.3233e+08      9.4191e+06
u03                  1.7111e+06      -3.2683e+08     2.2330e+07

normalized moments
n20                  7.9712e-04      2.7613e-04      5.1136e-04
n11                  -3.0674e-05     2.2530e-05      -1.6946e-05
n02                  8.4790e-04      1.6227e-03      5.9258e-04
n30                  -2.2320e-06     -1.0746e-06     -2.5540e-06
n21                  8.4507e-07      -1.3784e-06     2.4096e-07
n12                  5.2932e-06      4.8370e-06      3.9409e-07
n03                  1.7968e-07      -4.7570e-06     9.3428e-07

Hu's invariants
p1                   3.9737e+07      2.2288e+08      5.5678e+07
p2                   3.7010e+12      2.5010e+16      1.9705e+13
p3                   3.0252e+16      1.1485e+18      8.0000e+15
p4                   9.4508e+14      2.4451e+17      3.4540e+15
p5                   -2.2993e+30     1.2865e+35      3.5068e+29
p6                   -1.7689e+21     1.6371e+25      -2.7280e+21
p7                   4.5000e+30      -1.5427e+34     1.8153e+31

Hu's normalized invariants
f1                   1.6450e-03      1.8988e-03      1.1039e-03
f2                   6.3426e-09      1.8152e-06      7.7465e-09
f3                   3.3358e-10      2.4330e-10      1.4004e-11
f4                   1.0421e-11      5.1799e-11      6.0463e-12
f5                   -2.7955e-22     5.7737e-21      1.0746e-24
f6                   -8.0745e-16     2.9547e-14      -9.4682e-17
f7                   5.4712e-22      -6.9235e-22     5.5626e-23

beta's (Maitra's)
b1                   2.3438e-03      5.0347e-01      6.3564e-03
b2                   3.1971e+01      7.0591e-02      1.6376e+00
b3                   3.1240e-02      2.1290e-01      4.3175e-01
b4 (new)             -2.5743e+00     2.1519e+00      2.9394e-02
b5                   -4.7102e-02     3.0041e-01      -1.4185e-02
b6                   -1.9571e+00     -1.1991e-01     5.1766e+01

Table 2.1.6. Classification error estimates based on Eglin turntable data.

Experiment   Features                                h        Lower Bound   Upper Bound
                                                              Error         Error
1            Height, Width                           2.7500   5.00%         5.00%
2            Area, Height/Width ratio                3.5000   5.00%         5.00%
3            max grey level, min grey level          0.1000   17.50%        37.50%
4            mean grey level, sigma of grey level    0.1500   2.50%         5.00%
5            beta1, beta2                            1.0000   0.00%         0.00%
6            [illegible], beta6                      0.2500   25.00%        35.00%
3) Maitra's beta functions

The first set of features depends only on the target segmentation; that is, the grey-scale values of
the target do not affect them at all. These features include parameters such as the target's
height, width and area. Statistical parameters of the target's grey-scale values make up the
second feature set. The third feature set consists of Maitra's beta functions. Table 2.1.7
shows the features used in each feature set.
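The first two feature sets are simple to compute once a binary segmentation mask is available. The sketch below is illustrative only: the image and mask are hypothetical, height and width are taken from the bounding box of the mask, and the variance is the population variance (whether the original software divided by N or N-1 is not stated, so this is an assumption).

```python
import math


def segmentation_features(mask):
    """Shape features that depend only on the binary mask (feature set 1)."""
    rows = [r for r, line in enumerate(mask) if any(line)]
    cols = [c for c in range(len(mask[0])) if any(line[c] for line in mask)]
    height = rows[-1] - rows[0] + 1
    width = cols[-1] - cols[0] + 1
    area = sum(1 for line in mask for value in line if value)
    return {"height": height, "width": width, "area": area,
            "height/width": height / width}


def grey_scale_features(image, mask):
    """Statistics of the grey levels inside the mask (feature set 2)."""
    values = [image[r][c]
              for r in range(len(mask)) for c in range(len(mask[0]))
              if mask[r][c]]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return {"min": min(values), "max": max(values), "mean": mean,
            "variance": variance, "sigma": math.sqrt(variance)}


# Hypothetical 4 x 5 image with a five-pixel target.
mask = [[0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 0, 0],
        [0, 0, 0, 0, 0]]
image = [[10, 10, 10, 10, 10],
         [10, 50, 60, 70, 10],
         [10, 80, 90, 10, 10],
         [10, 10, 10, 10, 10]]

shape = segmentation_features(mask)
grey = grey_scale_features(image, mask)
print(shape)
print(grey)
```

Note that the shape features are unchanged if the grey levels inside the mask are altered, which is the property that distinguishes the first feature set from the second.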


Table 2.1.7 Features used for classification.

Feature set                       Features used
feature set 1 (segmentation)      target width
                                  target height
                                  target area
                                  height / width
feature set 2 (grey scale)        min grey level
                                  max grey level
                                  mean grey level
                                  variance grey level
                                  sigma grey level
feature set 3 (beta functions)    Maitra's beta functions (1, 2, 4, 5, 6)

2.1.1.4.3. Segmentation

Hand thresholding was used to segment the targets. Only one segmentation was used for
all eleven target images of any given orientation, because all target images of
any orientation were almost identical. It should be noted that because the targets in each
orientation were segmented identically, there were not enough independent samples of
the segmentation features to give statistically meaningful results. Furthermore, in the
experiment where the classes were split into clusters, no results for the segmentation features were
obtained: the images were so nearly identical that the covariance matrices became singular.
2.1.1.4.3. Results

The upper and lower bounds for both experiments are shown in Table 2.1.8. The results for
feature set one are not statistically valid because the covariance matrices became singular. These
results show that FLIR targets are reliably classifiable if they are well defined and
relatively large. The results also show that the bounds given by the Parzen error estimation
technique remain fairly tight when applied to real world data. This experiment also confirms
one of the Parzen technique's major shortcomings: its need for a large number of samples.


Table 2.1.8 Results of classifiability experiments on simulated FLIR.

                 with clusters         no clusters
feature set      upper     lower       upper     lower
1                     - *              0.0*      0.0*
2                22.4%     4.2%        22.1%     3.3%
3                13.6%     0.3%        13.9%     0.3%

* not statistically valid


One surprising result is found when comparing the results of the non-clustered and
clustered experiments. It was originally thought that splitting the classes into clusters would
improve classifiability. However, the results of this experiment show that no improvement is
made when splitting the classes into the front/rear and side views. Since the classes were split
solely on the heuristic argument that the statistics of the classes should change the most between
the two views, the lack of improvement in classifiability may be due to inappropriate clustering.
A statistical clustering technique may be used to verify whether inappropriate clustering was the
cause of this lack of improvement.

2.1.1.4.4. Experiment 4 - BRITT Data

The targets in the BRITT database consist of tanks, trucks, APCs and jeeps. The database
contains images in which the targets range in quality from highly distinguishable to virtually
invisible. All images were collected at a range of 2.5, 3.5, or 5 kilometers, with the height
of the FLIR sensor varied from 100 feet to 200 feet to give a constant angle of declination.
2.1.1.4.4. Target Classes

The experiment was run with three classes: tanks, trucks and APCs. The database was
manually searched and 50 of the most visible targets from each class were selected.
Unfortunately, there were not enough targets to enable an entire class to consist of a single type of
vehicle at a fixed range; because of this, two target classes consisted of multiple target types or
varying ranges. Table 2.1.9 shows the make-up of the different classes, and Figures 2.1.5, 2.1.6,
and 2.1.7 show the actual targets.


Table 2.1.9 Target types of classes.

Class compositions

Class      Number    Type    Distance
Tanks      50        M551    2.5 km
APCs       25        M113    2.5 km
           25        M114    2.5 km
Trucks     18        M35     2.5 km
           18        M35     3.5 km
           14        M35     5 km

2.1.1.4.4. Features

The features used were identical to the ones used on the simulated FLIR in experiment 3.

2.1.1.4.4. Segmentation

The targets were segmented by hand. The method used consisted of enclosing each target
in a bounding rectangle, as shown in Figures 2.1.5, 2.1.6, and 2.1.7. This was done because
the highly varying grey level values of any given target ruled out simple thresholding as a
segmentation technique. It is hoped that either a wire frame segmentation technique or a
segmentation method currently employed by industry can be used in the future.

2.1.1.4.4. Results

Table 2.1.10 reports the upper and lower classification error bounds for the different
feature sets. The confusion matrices for the different feature sets are given in Tables 2.1.11 -
2.1.16. The results show that statistical classifier performance is extremely poor. There are a
number of reasons for this. First, the targets are hard to classify; a human operator is hard
Figure 2.1.5 Fifty images in Tanks class with bounding rectangle segmentation shown.

Figure 2.1.6 Fifty images in APCs class with bounding rectangle segmentation shown.

Figure 2.1.7 Fifty images in Trucks class with bounding rectangle segmentation shown.
pressed to classify the targets in many cases. Secondly, it is thought that the segmentation
method used severely reduced classification ability. The next section shows that because the
segmentation features and beta functions are highly shape dependent, the classification
performance improves when a more sophisticated segmentation technique is used. Lastly, it is hard to
determine classifier performance when using the beta function feature set because of the width
of the error bounds. In the next section we show that much tighter bounds can be obtained by
using an advanced Parzen error estimation technique.
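The error percentage quoted with each confusion matrix follows directly from the matrix entries. As a small check, the sketch below recomputes the overall error from the entries of Table 2.1.12 (resubstitution error of the segmentation features); the function names are illustrative only.

```python
def error_rate(confusion):
    """Overall error: off-diagonal count divided by the total count."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return (total - correct) / total


def per_class_error(confusion):
    """Error rate for each true class (one entry per matrix row)."""
    return [(sum(row) - row[i]) / sum(row) for i, row in enumerate(confusion)]


# Rows are the true classes (APCs, tanks, trucks); columns are "classified as".
table_2_1_12 = [[42, 8, 0],
                [18, 32, 0],
                [29, 8, 13]]

overall = error_rate(table_2_1_12)
by_class = per_class_error(table_2_1_12)
print(f"overall error = {100 * overall:.1f}%")
print("per-class error:", [f"{100 * e:.0f}%" for e in by_class])
```

The per-class figures make the source of the error visible: in this matrix most of the trucks are assigned to the APC class.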


Table 2.1.10 Results of classifiability experiments on BRITT Data.

feature set        upper bound    lower bound
segmentation       51.3%          42.0%
grey scale         46.7%          42.7%
beta functions     63.6%          11.1%
Table 2.1.12 Confusion matrix for resubstitution error of segmentation features (42.0% error).

                    classified as
true class     APCs     tanks     trucks
APCs           42       8         0
tanks          18       32        0
trucks         29       8         13

Table 2.1.13 Confusion matrix for leave-one-out error of grey scale features (33.3% error).

                    classified as
true class     APCs     tanks     trucks
APCs           30       15        5
tanks          10       35        5
trucks         3        12        35

Table 2.1.14 Confusion matrix for resubstitution error of grey scale features (8.7% error).

                    classified as
true class     APCs     tanks     trucks
APCs           40       8         2
tanks          1        47        2
trucks         0        0         50

Table 2.1.15 Confusion matrix for leave-one-out error of Beta functions (52.7% error).

                    classified as
true class     APCs     tanks     trucks
APCs           36       7         7
tanks          30       18        2
trucks         20       13        17

Table 2.1.16 Confusion matrix for resubstitution error of Beta functions (8.0% error).

                    classified as
true class     APCs     tanks     trucks
APCs           42       2         6
tanks          0        50        0
trucks         0        4         46

2.1.1.4.5. Experiment 5 - BRITT Data - Improved Techniques

The fifth experiment was run to determine if any improvement in classifiability could be
gained by using a sophisticated technique to segment real world targets. To enable comparison,
the targets used were the same as those of experiment 4. The experiment was also run to test
the performance of the advanced Parzen error estimation scheme on data with unknown
distributions.

2.1.1.4.5. Target Classes

The  experiment  was  run  with  three  classes:  tanks,  trucks  and  APCs.  The  classes  were 
composed  of  the  same  hand  picked  targets  as  experiment  4  and  are  shown  in  Table  2.1.9. 
2.1.1.4.5. Features

Three of the feature sets used were identical to the ones used on the simulated FLIR and the first
BRITT experiment (experiments 3 and 4, respectively). However, two new feature sets were
also used; Table 2.1.17 shows the features used in these new sets.


Table 2.1.17 Features used for classification.

Feature set       Features used
feature set 4     rectangularity
                  perimeter^2 / area
                  height^2 / area
                  height^2 * range^2
feature set 5     normalized contrast
                  depression angle
                  area * range^2
                  (width / height)^2

2.1.1.4.5. Segmentation

The targets in this experiment were segmented using the Hughes segmenter referenced in
Section 2.1.1.1. The parameters of the segmenter used in this experiment were set by hand to
produce optimal results. Because the BRITT database includes target locations, no detection
scheme was needed to locate the targets. Figures 2.1.8, 2.1.9, and 2.1.10 show the segmented
targets.

2.1.1.4.5.  Results 

Table 2.1.18 reports the upper and lower classification error bounds for the different
feature sets using the original Parzen error estimation procedure. The results show the expected
improvement in classifiability over the results for experiment 4. In most cases, the upper bound
using the new segmentation technique is near the lower bound produced by the bounding
rectangle method. This increase in performance is due to the improved segmentation scheme. The
results of applying the advanced Parzen technique to the data are shown in Table 2.1.19. The
results show that the error bounds have been tightened in three of the five feature sets.
Although the tightening effect is not as drastic as had been hoped for, the improvement shown
is not insignificant. Because of the improvement in results shown and the theoretical argument
presented earlier, the advanced Parzen error estimation technique will be used in all future

Figure  2.1.8  Fifty  images  in  Tanks  class  after  automatic  segmentation. 


Figure  2.1.9  Fifty  images  in  APCs  class  after  automatic  segmentation. 


Figure 2.1.10 Fifty images in Trucks class after automatic segmentation.
experiments. Some people at the Night Vision Labs have also expressed an interest in seeing
the results from a parametric error estimation scheme. Table 2.1.20 shows the error bounds
found by a parametric (quadratic) error estimator. This table points out the major flaw of
parametric error estimates. Although they do not require as many samples to produce results
and they do produce valid upper bounds, the upper bounds that they produce are often
unnecessarily high and their lower bounds are valid only if the samples are drawn from a distribution of
the expected form (in this case Gaussian).


Table 2.1.18 Results of classifiability experiments on BRITT Data, original Parzen estimate.

feature set        upper bound    lower bound
segmentation       43.3%          6.7%
grey scale         34.7%          2.7%
beta functions     47.3%          30.0%
feature set 4      47.3%          21.3%
feature set 5      36.7%          19.3%

Table 2.1.19 Results of classifiability experiments on BRITT Data, advanced Parzen estimate.

feature set        upper bound    lower bound
segmentation       43.3%          10.7%
grey scale         34.0%          12.7%
beta functions     47.3%          28.0%
feature set 4      48.0%          20.7%
feature set 5      37.3%          24.0%
Table 2.1.20 Results of classifiability experiments on BRITT Data, quadratic estimate.

feature set        upper bound    lower bound
segmentation       53.3%          49.3%
grey scale         38.7%          30.7%
beta functions     54.7%          52.0%
feature set 4      62.0%          60.0%
feature set 5      42.0%          38.7%

2.1.1.5. Future Work

The  preceding  experiments  demonstrate  that  we  have  the  tools  available  to  find  the 
classifiability  of  a  set  of  targets  represented  as  grey-scale  images.  Our  future  work  in  this  area 
will  be  aimed  at  using  these  techniques  to  find  the  error  bounds  with  a  larger  number  of  features 
per  sample  vector.  It  is  believed  that  experiments  with  large  sample  vectors  will  show  that 
classifier  performance  on  FLIR  targets  is  acceptable  if  the  targets  are  highly  defined  and  enough 
features  are  used  for  classification.  As  was  mentioned  previously  in  this  report,  the  techniques 
presented  here  will  require  a  minimum  of  200  samples  per  class  to  prove  our  conjecture  with 
large  feature  vectors.  Once  acceptable  results  have  been  obtained,  we  plan  to  find  the 
classifiability  of  FLIR  targets  from  data  typical  of  the  type  drawn  from  the  real  world.  Targets 
will  be  drawn  at  random  from  the  database  with  no  regard  for  target  definition  and  a  clutter 
class  will  be  introduced.  The  motivation  for  the  introduction  of  the  clutter  class  is  the  fact  that 
any  target  detection  scheme  will  allow  a  significant  number  of  false  detections  to  occur  if  the 
probability  that  a  target  is  missed  is  minimized.  We  plan  to  simulate  these  false  target  detec¬ 
tions  with  the  introduction  of  the  clutter  class.  We  hope  to  use  this  final  experiment  to  prove 
our  conjecture  that  there  is  not  enough  information  in  a  single  FLIR  frame  to  accurately  classify 
a  random  sampling  of  typical  FLIR  targets  and  clutter. 

2.1.2.  Algorithm  and  Image  Metrics 

Image metrics are supposed to provide us with an independent set of variables for image
and algorithm characterization. The approach is to partition all ATRs (automatic target
recognizers) into three functional areas: detection, segmentation, and classification, as shown in
Figure 2.1.11. The image metrics are then used as sets of independent variables which will
independently characterize each of the functional areas.


[Block diagram: (Conditioned) Image -> Detection -> Object Coordinates -> Segmentation ->
Defined Object Regions -> Classification -> Classified Targets]

Figure 2.1.11 ERIM's Generic ATR Process.
Section 2.1.2.1 discusses some of the metrics proposed by ERIM and Section 2.1.2.2
proposes a new segmentation metric for threshold based segmenters.

2.1.2.1. Evaluation of ERIM's Metrics

ERIM has proposed many metrics for image and algorithm characterization. The following
sections examine some of these metrics and show that there are many times when these
metrics give meaningless results. The next subsection discusses the characterization of target
detection and gives examples in which images with the same $TIR^2$ values (and therefore the same
predicted complexity) have very different $P_d$ values, showing that there are cases where $TIR^2$
does not measure complexity with respect to the probability of detection.

Section 2.1.2.1.2 presents the results of our study of $TBIR^2$ as a metric for segmentation.
We show that images with the same $TBIR^2$ can have different segmentation accuracies.
Therefore image complexity with regard to segmentation cannot be measured by this metric.

2.1.2.1.1. Characterization of Target Detection

According to ERIM's formulation, the performance of a detection algorithm can be
evaluated by plotting $P_d$ against the following "independent" variables:

1. Target-Interference Ratio Squared ($TIR^2$)

2. Resolution Cells on Object ($RN_o$)

3. Expected Resolution Cells on Object ($RE_o$)

4. Edge Strength Ratio (ESR)

Our basic criticism of this characterization methodology is based on the conviction:

It can be misleading to examine the performance of an algorithm against a
collection of variables individually, particularly when it can be shown from
elementary theoretical analysis that the performance might in fact be dependent
upon some combination of these variables.

What we are trying to say is that when detection algorithms are theoretically analyzed, one can
demonstrate that $P_d$ must be a function of the product of the average contrast difference
(between the target and background) and the effective size of the target, which is given by
$RN_o$. Therefore, examining separately the dependence of $P_d$ on $TIR^2$ and $RN_o$ can be

* This statement is based on the following elementary result from classical decision theory
[Trees68]. Suppose we want to detect a deterministic signal s(t) that is corrupted by additive
random noise n(t), the observed signal being denoted by r(t). Optimum detection is obtained by
conducting the following likelihood ratio test

$$\frac{2}{N_0}\int_0^T r(t)\,s(t)\,dt \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \ln\eta + \frac{E}{N_0}$$
misleading.

Put another way, suppose we confine our attention to performance as measured by $P_d$ vs
$TIR^2$. The probability of detection, $P_d$, is the probability that an actual object of interest is
detected at least once, measured over all the actual objects of interest. $TIR^2$ is defined as:

$$TIR^2 = \frac{(\bar{x}_o - \bar{x}_b)^2}{\sigma_b^2}$$

It is possible to construct two different examples of targets with identical $TIR^2$ measures that
respond very differently to the process of detection. Suppose we have an image with the
grey scale values of the background and the target pixels distributed as shown in Figure 2.1.12.
$TIR^2$ would predict a certain level of complexity for that image. Suppose we had a second
image in which $\sigma_b$ and $\sigma_o$ were the same as in the first image, but the difference in the grey level
of the background and the target was greater (as shown on the bottom of Figure 2.1.12). $TIR^2$
would predict that the second image is less complex. This is a correct prediction because there
is less overlap between the distributions, and therefore less possibility for error.

Figure 2.1.13 shows a similar example; this time the difference in means is the same, but
$\sigma_b$ has decreased. $TIR^2$ again predicts a less complex image, which is correct since there is less
overlap.

The major flaw in the $TIR^2$ measure is that $\sigma_o^2$, the variance of the object, does not appear
in the $TIR^2$ equation. Therefore all the distributions in Figure 2.1.14 have the same $TIR^2$.
However, the complexity will be very different for each because the overlap in the grey scale
values varies greatly from one distribution to another.
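This point can be checked numerically. The sketch below assumes that $TIR^2$ is the squared difference in mean grey level divided by the background variance, which is consistent with the observation that the object variance does not enter the measure. It then compares two hypothetical Gaussian grey-level models that share the same means and background spread but differ in object spread; all numerical values are illustrative.

```python
import math


def tir_squared(mean_o, mean_b, sigma_b):
    """Assumed form of TIR^2: squared mean difference over background variance."""
    return (mean_o - mean_b) ** 2 / sigma_b ** 2


def gaussian_pdf(x, mean, sigma):
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def overlap_error(mean_o, sigma_o, mean_b, sigma_b, lo=-200.0, hi=500.0, steps=14000):
    """Equal-prior minimum error, 0.5 * integral of min(p_o, p_b), by the midpoint rule."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        total += min(gaussian_pdf(x, mean_o, sigma_o),
                     gaussian_pdf(x, mean_b, sigma_b))
    return 0.5 * total * dx


mean_b, sigma_b, mean_o = 100.0, 10.0, 130.0   # hypothetical grey levels

tir_narrow = tir_squared(mean_o, mean_b, sigma_b)
tir_wide = tir_squared(mean_o, mean_b, sigma_b)   # object spread plays no role
err_narrow = overlap_error(mean_o, 5.0, mean_b, sigma_b)    # object sigma = 5
err_wide = overlap_error(mean_o, 40.0, mean_b, sigma_b)     # object sigma = 40

print(f"TIR^2 = {tir_narrow:.1f} in both cases")
print(f"overlap error: narrow object = {err_narrow:.3f}, wide object = {err_wide:.3f}")
```

Both cases receive the same $TIR^2$, yet the overlap between the object and background distributions, and hence the best achievable pixel error, differs by a large factor.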

That this is indeed so was verified by the following experiments, one involving
uncorrelated additive noise and the other correlated noise; the latter we believe is more representative
of what happens in practice. First we show the case of uncorrelated noise.

In Figure 2.1.15, we have shown a sequence of synthesized target images. These images
are constructed by first adding an elliptical "target" to a uniform background and then adding
uncorrelated Gaussian random noise to the composite.

The following procedure, which we believe is close to what the NVL contractors are using,
was implemented for target detection in these images. We first construct a histogram of the
brightness values; the value corresponding to the most prominent valley in the histogram is used
for segmentation by thresholding. The segmented output is then integrated and compared
against 80 percent of the expected area of the actual target to determine whether the target is

where $H_1$ is the hypothesis that the signal is present and $H_0$ that it is absent. $E$ is the total energy
in the signal, $N_0$ the noise variance, and $T$ the total observation time. For our application, the left
hand side would be proportional to the product of $TIR^2$ and $RN_o$ for simple constant gray scale
targets. Similar conclusions can be drawn from the more advanced detection theory in [Trees71],
where the signal s(t) is allowed to be a random process.


[Histogram plots: pixel count versus pixel intensity]

Figure 2.1.13 Change in $TIR^2$
found. In the middle column of Figure 2.1.15, we have shown the histograms for the gray
levels; in the right column are illustrated the segmentation outputs. As is evident from the figure, the
detection process completely fails for some of the targets, although they all have the same
$TIR^2$ measures.
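The detection procedure described above can be sketched in a few lines. The version below rests on several assumptions that are not stated in the text: the target is brighter than the background, the "most prominent valley" is taken as the least populated grey level between the two dominant histogram peaks, and the peaks must be at least `min_separation` grey levels apart. The test image is synthetic and noise free.

```python
def histogram(image, levels=256):
    """Count of pixels at each grey level."""
    counts = [0] * levels
    for row in image:
        for value in row:
            counts[value] += 1
    return counts


def valley_threshold(counts, min_separation=20):
    """Grey level of the deepest valley between the two dominant peaks."""
    first = max(range(len(counts)), key=counts.__getitem__)
    far = [g for g in range(len(counts)) if abs(g - first) >= min_separation]
    second = max(far, key=counts.__getitem__)
    lo, hi = sorted((first, second))
    return min(range(lo, hi + 1), key=counts.__getitem__)


def detect(image, expected_area, fraction=0.8):
    """Declare a detection if the segmented area reaches 80% of the expected area."""
    threshold = valley_threshold(histogram(image))
    segmented_area = sum(1 for row in image for value in row if value > threshold)
    return segmented_area >= fraction * expected_area, segmented_area, threshold


# Synthetic 20 x 20 scene: background near grey level 50, a 6 x 8 target near 200.
image = [[50 + (r + c) % 3 for c in range(20)] for r in range(20)]
for r in range(5, 11):
    for c in range(4, 12):
        image[r][c] = 200 + (r + c) % 3

found, area, threshold = detect(image, expected_area=48)
print(found, area, threshold)
```

When noise fills in the valley between the two histogram modes, the threshold chosen this way becomes unreliable and the area test fails, which is the failure mode illustrated in Figure 2.1.15.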

We will now demonstrate similar results with correlated noise. First, a few words about
how noise patterns with controlled correlations were generated. Suppose v(x,y) represents an
array of uncorrelated random numbers, which are easily generated by a standard random
number generator. If we now construct a new array by using the following recursion

$$y(x,y) = a\,y(x-1,y) + b\,y(x,y-1) + v(x,y) - ab\,y(x-1,y-1)$$

we can show [RosKak82] that the resulting noise pattern has the following correlation function

$$R(i,j) = e^{-a|i| - b|j|}$$

This assumes that the noise pattern is zero-mean and of unit variance. This pattern can be
multiplied by an appropriate scaling factor to obtain a desired signal-to-noise ratio when the pattern
is added to the synthesized target image. In the experiments reported here, we have used
a = b = 0.4.
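A minimal sketch of this recursion is given below. It is not the original implementation: values outside the array are assumed to be zero, and the white noise is scaled by sqrt((1 - a^2)(1 - b^2)) so that the stationary output has unit variance. For a recursion with coefficients a and b, the normalized correlation at lag (i, j) works out to a^|i| b^|j|, which has the exponential form given above with decay constants -ln a and -ln b.

```python
import math
import random


def correlated_field(v, a, b):
    """Apply y(x,y) = a*y(x-1,y) + b*y(x,y-1) - a*b*y(x-1,y-1) + v(x,y).

    Values outside the array are taken as zero (an assumed boundary condition).
    """
    nx, ny = len(v), len(v[0])
    y = [[0.0] * ny for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            prev_x = y[i - 1][j] if i > 0 else 0.0
            prev_y = y[i][j - 1] if j > 0 else 0.0
            prev_xy = y[i - 1][j - 1] if i > 0 and j > 0 else 0.0
            y[i][j] = a * prev_x + b * prev_y - a * b * prev_xy + v[i][j]
    return y


def correlated_noise(nx, ny, a, b, seed=0):
    """Zero-mean correlated noise with (asymptotically) unit variance."""
    rng = random.Random(seed)
    scale = math.sqrt((1.0 - a * a) * (1.0 - b * b))
    v = [[scale * rng.gauss(0.0, 1.0) for _ in range(ny)] for _ in range(nx)]
    return correlated_field(v, a, b)


# The response to a single impulse shows the separable decay a^i * b^j.
impulse = [[0.0] * 5 for _ in range(5)]
impulse[0][0] = 1.0
response = correlated_field(impulse, 0.4, 0.4)
print(response[1][1], response[2][3])

noise = correlated_noise(64, 64, 0.4, 0.4)
```

Because of the zero boundary condition, the first few rows and columns of the generated pattern have slightly lower variance than the interior; in practice a margin can be discarded.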

In  Figure  2.1.16,  we  have  shown  results  for  the  case  of  correlated  noise  that  are  similar  to 
those  in  Figure  2.1.15. 

2.1.2.1.2. Characterization of Target Segmentation

The next step in the ATR process, after a target is detected, is to segment the target. ERIM
proposes characterizing segmentation with the following independent variables:

1. Resolution Cells on Object ($RN_o$)

2. Expected Resolution Cells on Object ($RE_o$)

3. Target-Background Interference Ratio Squared ($TBIR^2$)

4. Edge Strength Ratio (ESR)

According to ERIM, the performance of a segmentation algorithm can be measured by plotting
segmentation accuracy, $A_s$, versus these variables. The segmentation accuracy is defined to be
the ratio of two factors: the first factor is the area of the intersection of the segmented region for an
object and the true image region corresponding to that object, and the second factor is the area of
the union of these two regions. The basic idea in the ERIM methodology can be summed up as follows:
Let's say we have two target images $I_1$ and $I_2$, with the value of $TBIR^2$ for $I_2$ larger than what
* - 

If  approximate  range  to  the  target  is  known,  which  means  if  the  size  of  the  target  one  is  looking 
for  is  approximately  known,  it  is  unlikely  that  one  would  accept  V  than  about  80  percent  of  that 
area  from  the  segmenter  in  order  to  declare  the  target  as  present 
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it  is  for  I Then,  according  to  ERIM,  the  segmentation  accuracy  for  1 2  must  be  poorer  than  it 
would  be  for  / 1 .  Equivalently,  if  we  have  two  images  of  the  same  TBIR  2  value,  then  it  should 
be  possible  to  segment  them  both  with  the  same  accuracy. 
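
The segmentation accuracy defined above, intersection over union, can be written down directly. The sketch below assumes both regions are given as boolean masks of the same shape and that the "factors" are pixel counts; the function name and the convention of returning 0 for two empty masks are assumptions made for illustration.

```python
import numpy as np

def segmentation_accuracy(segmented, truth):
    """Ratio of the intersection to the union of two binary regions."""
    segmented = np.asarray(segmented, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    union = np.logical_or(segmented, truth).sum()
    if union == 0:
        return 0.0                      # assumed convention for two empty masks
    return float(np.logical_and(segmented, truth).sum() / union)
```

A segmented region that exactly matches the true region scores 1, and both missed target pixels and extra background pixels lower the score.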

In this report, we will show that it is easy to construct simulated examples of two different
target images with the same TBIR² value that yield very different segmentation accuracies.
This was done by taking a theoretical look at TBIR² to see where it might
have problems, and then constructing examples which exploit those problems.

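TBIR² is defined as:

$$\mathrm{TBIR}^2 = \frac{(\bar{x}_o - \bar{x}_b)^2}{\sigma_o \sigma_b}$$

where $\bar{x}_o$ and $\bar{x}_b$ are the mean gray values of the object and the background, and
$\sigma_o$ and $\sigma_b$ are the corresponding standard deviations.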

Take for example the histogram in Figure 2.1.17a, which shows a distribution for the
background and the object. The TBIR measure would predict that an image with such a
distribution would be as complex as another image with a smaller difference in means
$(\bar{x}_o - \bar{x}_b)$ if:

1.  The object had a smaller variance, as shown in Figure 2.1.17b,

2.  The background had a smaller variance, as shown in Figure 2.1.17c, or

3.  Both the object and the background had smaller variances, as shown in Figure 2.1.17d.

This seems like a reasonable relationship to have for Gaussian distributions, as illustrated in
Figure 2.1.17, because the changes always result in distributions in which a given pixel is
equally likely to be correctly associated with the proper class (i.e. background or object).
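
A small numerical illustration of this trade-off follows. It assumes that TBIR² is the squared difference of the means divided by the product of the object and background standard deviations, which is consistent with the three cases listed above; the function name and all numbers are hypothetical.

```python
def tbir_squared(mean_obj, mean_bg, sigma_obj, sigma_bg):
    """Assumed form: (mean_obj - mean_bg)^2 / (sigma_obj * sigma_bg)."""
    return (mean_obj - mean_bg) ** 2 / (sigma_obj * sigma_bg)

# Reference image: difference in means of 40, both standard deviations 10.
reference = tbir_squared(140.0, 100.0, 10.0, 10.0)

# Half the difference in means, compensated by smaller spreads.
smaller_object_sigma = tbir_squared(120.0, 100.0, 2.5, 10.0)
smaller_background_sigma = tbir_squared(120.0, 100.0, 10.0, 2.5)
both_smaller = tbir_squared(120.0, 100.0, 5.0, 5.0)
```

Under this assumed form all four hypothetical images share one TBIR² value, so the measure would rate them as equally complex.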

One  problem  with  using  distributions  to  characterize  an  image  is  that  spatial  information  is 
lost.  Many  segmentation  errors  are  caused  by  targets  which  do  not  have  uniform  intensities. 
The  pixels  in  the  target  whose  grey  scale  values  are  close  to  the  mean  of  the  background  can 
cause  the  segmenter  to  split  the  target  into  two  or  more  segments. 

To illustrate this point, consider a synthetic elliptical target with a non-uniform intensity.
Figure 2.1.18 shows a 3D plot of such a target, plotting the grey value on the z-axis. Figure
2.1.19 shows four noisy targets which were created by adding uncorrelated Gaussian noise to
the target and the background so that the images have the same TBIR². The noise statistics for
the targets are shown in Table 2.1.21. The segmentation procedure used is the same as that used in
Figure 2.1.15 of Section 2.1.2.1.

The thresholded images show that for $\bar{x}_o - \bar{x}_b$ equal to 20 and 30, the image is split into two
segments. For $\bar{x}_o - \bar{x}_b$ equal to 40 and 50, the image is left intact. Figure 2.1.19 therefore shows
that two images with the same TBIR² can have very different segmentation accuracies.


[Figure 2.1.19: for each noisy target, the columns show the noisy image, its histogram, and the
thresholded image. See Table 2.1.21 for target and background statistics.]

Table  2.1.21  Target  and  background  statistics  for  Figure  2.1.19.

2.1.2.1.3.  Conclusions

The conclusion to be drawn from these studies is that simple metrics such as TIR and
TBIR² may measure the complexity of an image for the purpose of detection or segmentation
some of the time. However, it is easy to construct examples which show that they completely fail
to measure image complexity. Therefore there are many cases where TIR and TBIR² say
nothing at all about the process of detection or segmentation.

2.1.2.2.  Preliminary  Work  in  the  Formulation  of  New  Metrics

Metrics have been devised for the systematic characterization of the algorithms in the ATR
process. The approach to characterization has been to divide the ATR process into three steps:
detection, segmentation, and classification. Although different metrics for each step have been
proposed, the previous sections have shown that these metrics don’t work for all images. We
propose that each step of the ATR process should be broken down into less general algorithms
and that metrics be devised for each of these algorithms. Experience gained in designing metrics for
the less general algorithms could then provide insight on how to design a metric for the more
general algorithms (if such a metric can be found). The following subsection proposes a new
metric for characterizing a threshold based segmenter.

2.1.2.2.1.  A  Metric  for  Characterizing  Threshold  Based  Segmenters

The most critical step in a threshold based segmenter is the selection of the threshold value
(or values if it is a more sophisticated segmenter). Our first attempt at designing a metric for a
threshold based segmenter assumes that the complexity of an image (with respect to segmentation) is
related to the number of threshold values which will give a certain segmentation accuracy.*
Some images are easy to segment with a threshold based segmenter because there are many
threshold values which will give a good segmentation accuracy. Other images will have few, if
any, threshold values which will give good accuracy and are therefore difficult to segment. The
segmentation threshold (ST) metric proposed here measures the segmentation complexity of an
image with respect to a threshold based segmenter by measuring the number of threshold values
which will result in a certain segmentation accuracy.

* This of course assumes that the images being characterized have the same number of
quantization levels. For now, this is a valid assumption since the data we have been given have all
used eight bits per pixel.

The ST metric works like this: The image to be characterized is thresholded at a value x
by assigning all pixels below x to the background and all pixels equal to and above x to the
target. The largest region is found and all other regions are deleted. (A region is defined to consist
of a group of pixels that are four-connected to each other.) The segmentation accuracy ($A_s$) of
the remaining region is computed relative to a ground truth image. The above process is
repeated for all possible threshold values. Finally, $A_s$ vs. the threshold value is plotted. Figure
2.1.20 shows the $A_s$ vs. threshold plots in the center column for the same images as in Figure
2.1.19. The x-axis is the value of the threshold and the y-axis is the segmentation accuracy.
The wider the peak, the more thresholds there are that will give a certain accuracy. We can see that the
$(\bar{x}_o - \bar{x}_b) = 20$ target is difficult to segment using a threshold based segmenter because there is
only one threshold value that will give a segmentation accuracy greater than 70%. The
$(\bar{x}_o - \bar{x}_b) = 30$ target is easier to segment because there are several threshold values which will
give greater than 85% segmentation accuracy. The remaining two images are increasingly easy
to segment because, as the plots show, there are more values that will give a greater than 85%
segmentation accuracy. The plots in the right column of Figure 2.1.20 show the number of
thresholds vs. $A_s$. The plots show that there are:

1.  very few values (only 1) for the $(\bar{x}_o - \bar{x}_b) = 20$ image,

2.  ≈15 values for the $(\bar{x}_o - \bar{x}_b) = 30$ image,

3.  ≈20 values for the $(\bar{x}_o - \bar{x}_b) = 40$ image, and

4.  greater than 30 values for the $(\bar{x}_o - \bar{x}_b) = 50$ image,

which give a segmentation accuracy of greater than 75%. Therefore the images are increasingly
easy to segment (using a threshold based segmenter) as you move from the top image to the
bottom image of Figure 2.1.20.
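
The ST procedure described above can be sketched as follows. This is a minimal illustration, not the code used to produce Figure 2.1.20: the function names, the breadth-first region search, and the `levels` parameter (256 for eight-bit data) are assumptions of the sketch.

```python
from collections import deque
import numpy as np

def largest_region(mask):
    """Return a mask holding only the largest four-connected region of `mask`."""
    rows, cols = mask.shape
    seen = np.zeros(mask.shape, dtype=bool)
    best = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r, c] or seen[r, c]:
                continue
            seen[r, c] = True
            queue, region = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny, nx] and not seen[ny, nx]):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            if len(region) > len(best):
                best = region
    out = np.zeros(mask.shape, dtype=bool)
    for y, x in best:
        out[y, x] = True
    return out

def st_curve(image, truth, levels=256):
    """Segmentation accuracy for every threshold 0 .. levels-1."""
    truth = np.asarray(truth, dtype=bool)
    accuracies = []
    for t in range(levels):
        region = largest_region(image >= t)   # pixels >= t go to the target
        union = np.logical_or(region, truth).sum()
        inter = np.logical_and(region, truth).sum()
        accuracies.append(float(inter / union) if union else 0.0)
    return accuracies

def st_metric(accuracies, min_accuracy):
    """Number of thresholds whose accuracy exceeds `min_accuracy`."""
    return sum(acc > min_accuracy for acc in accuracies)
```

Plotting the list returned by `st_curve` against the threshold index would give a curve of the kind described for the center column, and `st_metric` gives the count read off for a chosen accuracy level.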

2.1.2.2.2.  Conclusions

The ST metric correctly predicted which of the synthesized images are more difficult to
segment. These were the same images that fooled the TBIR² measure. The major flaw with
this method is that it tells us how many thresholds will give us a certain segmentation accuracy,
but it won’t tell us how hard it is to find those thresholds. For example, the ST metric will be
fooled if there is no overlap in the pixel values for the foreground and the background. In such
a case it is very easy to pick a threshold. Suppose all the background pixels for a
given image have the value 10 and all the target pixels have the value 200; then any threshold
between 10 and 200 would give perfect segmentation. Consider a second image with the same
background values and the target values equal to 20. It is as easy to segment as the previous
image; however, our measure would say that it is harder to segment because there are fewer
thresholds that will give the proper segmentation.
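
The two-image example can be checked numerically. The sketch below is a simplification: it counts thresholds that reproduce the ground truth exactly and skips the largest-region step, which makes no difference for a single noise-free target. The image size and the function name are hypothetical.

```python
import numpy as np

def perfect_thresholds(image, truth, levels=256):
    """Count thresholds x for which (image >= x) reproduces `truth` exactly."""
    return sum(bool(np.array_equal(image >= x, truth)) for x in range(levels))

truth = np.zeros((8, 8), dtype=bool)
truth[2:6, 2:6] = True                     # hypothetical square target

bright = np.where(truth, 200, 10)          # background 10, target 200
dim = np.where(truth, 20, 10)              # background 10, target 20

count_bright = perfect_thresholds(bright, truth)   # thresholds 11 .. 200
count_dim = perfect_thresholds(dim, truth)         # thresholds 11 .. 20
```

Both images are trivially separable, yet the counts differ by more than an order of magnitude, which is the weakness of the ST metric noted above.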

Although the ST metric is based on a single threshold value, it works well for the synthesized
images. Future work will include:

1.  testing the ST metric on real FLIR images to see how well it measures segmentability, and

2.  looking into making it a more robust measure by basing it on a Bayes Classifier for
minimal error. Such an approach views segmentation as a problem of classifying each
pixel as either background or target. This makes it possible to find optimal thresholds.

2.2.  TWO  FLIR  SEGMENTERS

Although most of the effort this time has been in processing LADAR data, work is still
continuing in FLIR processing. We believe that one must be able to perform reasonable low
level processing on a given sensor before trying to fuse it with other sensors. This section
presents our experiences with using two different segmenters (edge guided threshold, and tree
traversal) on FLIR images. These experiences should prove useful when fusing FLIR with
LADAR.

2.2.1.  An  EGT  Based  Segmenter  for  FLIR  Data 

A tunable edge-guided threshold-based (EGT) segmenter was developed for the automatic
extraction of target silhouettes from FLIR images. The segmenter was designed to be part of an
intelligent target recognizer. It is “tunable” in that one may specify the expected brightness
range of the targets and the expected target areas. The expected brightness is specified as a
percentage range. For example, from time of day and weather information, and possibly
knowledge of how active the targets have been, we might expect the targets to consist of the
hottest 10% to 20% of the pixels in the image. This percentage estimate also depends on the
expected size of the targets relative to the image size. The expected target area (in pixels) is
also specified as a range of values, and may be calculated from range information. Once
potential target regions have been extracted, the segmentation results may be evaluated by an expert
system and the segmenter called again with new parameters.

2.2.1.1.  The  Algorithm

The first step in the segmentation process is to perform edge detection on the original gray
scale FLIR image. The edge detection process produces candidate object edge pixels by
adaptively thresholding the Sobel edge magnitude image. The original image is convolved with the
horizontal and vertical Sobel edge masks to produce the horizontal and vertical edge gradients
(H and V), as illustrated in Figure 2.2.1. The absolute values of H and V are then used to
calculate the edge magnitude M. The mean value of M over the edge magnitude image is computed,
and an edge threshold of 2.25*Mean applied to M produces the final edge image.
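
A minimal sketch of this edge detection step is given below. Two points are assumptions of the sketch rather than details from the text: the edge magnitude is taken as M = |H| + |V| (the exact combination is given only in Figure 2.2.1), and the image border is handled by replicating edge pixels. The function names are hypothetical.

```python
import numpy as np

# Standard 3x3 Sobel masks; SOBEL_H responds to left-right intensity changes.
SOBEL_H = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_V = SOBEL_H.T

def filter3x3(image, mask):
    """Apply a 3x3 mask by sliding-window correlation. Because only absolute
    values are used later, the sign flip relative to true convolution is
    irrelevant here."""
    padded = np.pad(np.asarray(image, dtype=float), 1, mode="edge")
    rows, cols = np.asarray(image).shape
    out = np.zeros((rows, cols))
    for dr in range(3):
        for dc in range(3):
            out += mask[dr, dc] * padded[dr:dr + rows, dc:dc + cols]
    return out

def edge_image(image, factor=2.25):
    """Boolean edge map: edge magnitude above factor * mean magnitude."""
    h = filter3x3(image, SOBEL_H)
    v = filter3x3(image, SOBEL_V)
    m = np.abs(h) + np.abs(v)          # assumed form of the magnitude M
    return m > factor * m.mean()
```

Because the threshold is tied to the mean of M, it adapts to the overall edge content of each image.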

Once edge pixels have been found, histograms of the gray scale values of the edge and
nonedge (object) pixels in the original FLIR image are computed separately for the purpose of
selecting a threshold to use to segment the image. This is preferable to computing the
histogram of the entire original image because edge pixels typically have gray values between those
of the regions that they separate, and so they tend to make the modes of the regions blend
together by filling in the valley between them. This makes threshold selection more difficult.
Computing separate histograms for object and edge pixels should result in more distinct
(nonoverlapping) modes in the object histogram, and peaks in the edge histogram corresponding to
the valleys between these modes (see Figure 2.2.2). The gray values at which these resulting
peaks in the edge pixel histogram or deeper valleys in the object pixel histogram occur are good
candidates for thresholds for segmenting the image.

Figure 2.2.2  Illustration of effect of histogramming object and edge pixels separately.

The image threshold is then selected as follows. Suppose the lower and upper brightness
percentage limits are LPCT and UPCT (e.g. to specify that a target should consist of the
brightest 10% to 20% of the image pixels, LPCT = 0.10 and UPCT = 0.20). Upper and lower pixel
count limits are then calculated as LOWER = LPCT*X*Y and UPPER = UPCT*X*Y, where X
and Y are the image dimensions. One then starts at the rightmost object histogram bin
(corresponding to the brightest gray values) and works towards the left, keeping a running sum
of the number of pixels in each bin. The bins for which the running sum first exceeds LOWER
and UPPER are marked, thereby specifying the interval of interest in the histogram in which
we expect to find a good threshold. We select as our threshold the gray value of the bin with the
minimum value in this interval. This value should correspond to the deepest valley in the object
histogram in the interval of interest.
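
The threshold selection rule can be sketched as below. The function name is hypothetical, and two details are assumptions of the sketch: ties between equally low bins are resolved in favor of the darkest bin, and the interval is extended to bin 0 if the running sum never exceeds UPPER.

```python
def select_threshold(object_hist, n_pixels, lpct, upct):
    """Pick the gray value of the emptiest object-histogram bin between the
    bins where the right-to-left running sum first exceeds LOWER and UPPER.

    object_hist: pixel counts per gray value for nonedge (object) pixels.
    n_pixels:    X*Y, the total number of pixels in the image.
    """
    lower = lpct * n_pixels
    upper = upct * n_pixels
    running = 0
    bright_bin = dark_bin = None
    for g in range(len(object_hist) - 1, -1, -1):   # brightest bin first
        running += object_hist[g]
        if bright_bin is None and running > lower:
            bright_bin = g
        if running > upper:
            dark_bin = g
            break
    if bright_bin is None:
        raise ValueError("running sum never exceeds LOWER")
    if dark_bin is None:
        dark_bin = 0                                # assumed fallback
    # Deepest valley in the interval of interest.
    return min(range(dark_bin, bright_bin + 1), key=lambda g: object_hist[g])
```

For a histogram with a clear valley inside the interval of interest, the returned gray value falls in that valley.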

The threshold is then applied to the original image, with pixels whose value is less than the
threshold being set to zero and the others retaining their original value. A median filter may be
applied to the result in order to reduce some of the border noise and holes usually
accompanying a threshold segmentation. Whether or not this filter is applied depends on the expected
target size. The filter would not be applied if the target was small enough to become disconnected
by it.

Finally, connected component labeling is performed and component areas are calculated.
The target area limits are AMIN and AMAX, and are specified at the same time as LPCT and
UPCT. Note that these target area limits have nothing to do with LOWER and UPPER, which
were used to find a proper image threshold. Only those components within the specified area
limits are kept as possible target silhouettes. Since gray level information was kept when
generating the silhouettes, the label and final silhouette images are the only ones necessary for
calculating both binary and gray-level features.
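
The thresholding, labeling, and area filtering steps can be sketched together. This is an illustration only: the optional median filter is omitted, four-connectivity is assumed because the text does not state the connectivity used by the EGT segmenter, and the function name `silhouettes` is hypothetical.

```python
from collections import deque
import numpy as np

def silhouettes(image, threshold, amin, amax):
    """Threshold, label four-connected components, and keep those whose
    area lies in [amin, amax]. Returns (label image, gray-level silhouettes)."""
    image = np.asarray(image)
    above = image >= threshold
    kept = np.where(above, image, 0)       # survivors keep their gray value
    labels = np.zeros(image.shape, dtype=int)
    seen = np.zeros(image.shape, dtype=bool)
    rows, cols = image.shape
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if not above[r, c] or seen[r, c]:
                continue
            seen[r, c] = True
            queue, members = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                members.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and above[ny, nx] and not seen[ny, nx]):
                        seen[ny, nx] = True
                        queue.append((ny, nx))
            if amin <= len(members) <= amax:
                next_label += 1
                for y, x in members:
                    labels[y, x] = next_label
            else:
                for y, x in members:       # outside the area limits: discard
                    kept[y, x] = 0
    return labels, kept
```

The two returned arrays correspond to the label image and the final silhouette image mentioned above.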

2.2.1.2.  Results

Here  we  present  results  obtained  by  running  the  EGT  segmenter  on  both  simulated  and 
actual  FLIR  images.  Figure  2.2.3  shows  the  histograms  for  the  entire  image,  the  edge  pixels 
only,  and  the  nonedge  (object)  pixels  only  for  a  tank  from  our  Eglin  Turntable  data  set  (file 
eglinlS).  Note  that  histogramming  the  object  and  edge  pixels  separately  helped  bring  out  a  small
valley  in  the  object  histogram  near  the  bin  corresponding  roughly  to  pixels  with  gray  value  96. 
The  brightness  range  specified  was  10-20%,  which  was  found  to  correspond  to  the  gray  level 
range  84-95,  and  the  area  range  was  500-2500  pixels.  As  indicated,  a  threshold  of  94  was 
chosen  automatically.  Figure  2.2.4  shows  the  original  FLIR  image,  and  intermediate  and  final 
segmentation  results. 

Figure 2.2.3  Histograms for FLIR image from Eglin Turntable data set.

Figure 2.2.4  Original FLIR image from Eglin Turntable data set and processing results.

A simulated FLIR image of a tank was obtained by digitizing a blurry image of a dark tank
model on a light background, resampling it from 512 by 480 down to 128 by 120, and inverting
it. The EGT segmenter was run on this image with brightness range 10-20% and area range
500-2500. The brightness range was found to correspond to gray level value range 148-199,
and a threshold value of 182 was chosen automatically. Figure 2.2.5 demonstrates the
deepening of the valley between the modes of the object histogram due to histogramming the edge pixels
separately, and Figure 2.2.6 contains the original, intermediate, and final result images.

2.2.1.3.  Comparison  with  Hughes  Segmenter 

In previous reports we showed composite images of 50 APC’s, 50 tanks, and 50 trucks
used for an interclass separation experiment. These targets were hand extracted by first
enclosing the targets within tight rectangular windows and then within each window using the
complex likelihood segmenter described in the Bandwidth Reduction and Intelligent Target
Tracking (BRITT) Phase One Final Report by Hughes Aircraft Company’s Electro-Optical & Data
Systems Group [Hughes84]. The 150 target images used were from the BRITT data set.

We also ran the EGT segmenter on this data set. The brightness range specified was
0-10% and the area range was 100-2500 pixels. The same limits were used for all 150 images
(the segmenter was NOT tuned for each image). Even with this disadvantage (remember the
Hughes segmenter was provided windows around the targets via human input which guarantee
the target to be in the center of the window and at least a five pixel border between the target
and the window), our simple segmenter produced results comparable to those achieved by
Hughes. Figures 2.2.7, 2.2.10, and 2.2.13 are composites of the original FLIR images, Figures
2.2.8, 2.2.11, and 2.2.14 are Hughes segmentation results, and Figures 2.2.9, 2.2.12, and 2.2.15
are EGT results. Because the same limits were used for the entire data set (which contained
targets at several ranges), sometimes the EGT silhouettes were slightly larger or smaller than they
should be. Also, some of the target images with low target-to-background contrast were broken
up into several components because a median filter was applied to clean up some of the
threshold-produced noise and make the silhouette borders appear better. If a component of such
a broken up target was below the lower area limit, it was classified as an invalid target region
and thrown away. These were the two chief causes of the few poor EGT segmentation results.

2.2.1.4.  Summary

We have presented an edge guided threshold based segmenter for FLIR images. Although
it is much simpler than the segmenter used by Hughes, its performance is comparable to that of
the Hughes segmenter, at least for the images that both were tested on.

One of the notable features of our segmenter is that it is “tunable”, that is, one can specify
the expected brightness (or darkness) range of the target. This tunability will play an important
role in our production system based approach to fusing LADAR and FLIR data [AndKak87]. As
data is gathered from both sensors, hypotheses will be made as to what the target will look like.
The FLIR segmenter can then be tuned to try to accurately segment the target and then the
hypotheses can be verified.


Figure 2.2.5  Histograms for simulated FLIR image of a tank.

Figure 2.2.6  Simulated FLIR image of a tank and processing results.

Figure 2.2.8  Composite of the APC segmentation results from the Hughes likelihood segmenter.

Figure 2.2.9  Composite of the APC segmentation results from the edge-guided threshold segmenter.

Figure 2.2.10  Composite of 50 original FLIR images of tanks from the BRITT data set.

Figure 2.2.11  Composite of the tank segmentation results from the Hughes likelihood segmenter.

Figure 2.2.12  Composite of the tank segmentation results from the edge-guided threshold segmenter.

Figure 2.2.13  Composite of 50 original FLIR images of trucks from the BRITT data set.

Figure 2.2.14  Composite of the truck segmentation results from the Hughes likelihood segmenter.

2.2.2.  FLIR  Segmentation  by  Tree  Traversal

This  section  presents  the  tree  traversal  segmentation  algorithm  in  [HoPa76],  and  evaluates 
its  merits  as  a  FLIR  segmenter.  Although  only  FLIR  data  is  used  here,  the  algorithm  is  general 
enough  to  apply  to  many  different  types  of  images  including  LADAR  by  merely  changing  the 
segmentation  criteria. 

The  following  is  a  discussion  of  the  algorithm  and  some  FLIR  segmentation  results.  The 
implementation  of  the  algorithm  presented  here  uses  criteria  suitable  for  segmentation  of 
reflectance  or  FLIR  imagery.  Although  not  pursued  yet,  it  may  be  possible  to  "tune"  portions  of 
the  implementation  to  take  advantage  of  some  special  FLIR  characteristics.  This  is  a  topic  for 
future  work. 

2.2.2.1.  The  Algorithm 

The  algorithm  consists  of  three  main  steps: 

(1)  Split  and  merge  using  a  tree  representation  of  the  image. 

(2)  Merging  based  on  adjacency  in  the  image  plane. 

(3)  Further  merging  of  image  plane  regions  based  on  some  "nearest  neighbor"  criterion. 

In steps one and two the difference between minimum and maximum gray level is the
criterion used on the FLIR data. In step one, if the difference between the maximum and
minimum gray levels of a region is too great, the region is split. For both steps one and two,
regions may be merged provided the difference between the maximum and minimum gray
levels of the resulting region is small enough.

The  "nearest  neighbor"  criterion  of  step  three  is  average  region  gray  level.  Two  adjacent 
regions  can  only  be  merged  if  the  difference  between  their  respective  average  gray  levels  is 
small  enough.  The  following  sections  discuss  each  step  in  more  detail. 

2.2.2.1.1.  Split  and  Merge  Using  a  Tree  Representation 

In this portion of the algorithm all operations on segments of the image are performed
within the confines of a tree structure. Thus in order to begin, the tree structure must be
initialized to represent the necessary information about the image. This means that an initial
segmentation of the image which assigns each pixel of the image plane to a node of the tree (see Figure
2.2.16) must be chosen. This actually corresponds to selecting a starting level within the tree
structure. Each node of this level contains: the (x,y) position of the block it represents, the
number of pixels to a side of this block, and the largest and smallest gray level within the block.

Following this initial segmentation of the image plane, each node representing a square
block of the image is examined to determine whether the block should be broken into four
smaller blocks based on the min/max criterion described above. Using this same criterion, four
blocks whose nodes share a common parent node are examined to determine if they should be
merged into one larger block.

Note  that  a  block  which  is  created  by  a  split  operation  cannot  be  the  object  of  a  merge 
operation  within  the  same  tree  structure.  Nor  can  a  block  that  is  the  result  of  a  merge  operation 
be  subject  to  a  split.  Therefore,  only  one  pass  is  necessary,  and  the  nodes  forming  the  final 
cutset  of  the  tree  structure  represent  the  result  of  the  "split  and  merge"  portion  of  the  algorithm. 
See  Figure  2.2.17  for  an  example  of  the  results  of  this  step. 
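
The min/max test and the resulting cutset can be illustrated with a short recursive sketch. It is a simplification and not the implementation described here: it works top-down from the whole image instead of starting at an intermediate tree level, and it assumes a square image whose side is a power of two. For the min/max criterion, a block passes the test only if all of its sub-blocks do, so a top-down split is expected to yield a cutset of the same kind. The function name and tolerance parameter `tol` are hypothetical.

```python
import numpy as np

def split_blocks(image, x, y, size, tol):
    """Return the cutset as a list of nodes (x, y, size, min, max).

    A block is kept whole when max - min <= tol; otherwise it is broken
    into four smaller square blocks, which are examined in turn.
    """
    block = image[y:y + size, x:x + size]
    lo, hi = int(block.min()), int(block.max())
    if hi - lo <= tol or size == 1:
        return [(x, y, size, lo, hi)]
    half = size // 2
    nodes = []
    for dy in (0, half):
        for dx in (0, half):
            nodes += split_blocks(image, x + dx, y + dy, half, tol)
    return nodes
```

Each returned tuple carries the same information the text lists for a tree node: block position, side length, and the extreme gray levels within the block.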

An  important  effect  of  the  tree  structure  representation  is  the  limitation  that  it  imposes  on 
the  merge  operation.  Note  that  only  nodes  sharing  a  common  immediate  parent  are  considered 
for  merging.  Because  of  this,  the  next  step  of  the  algorithm  departs  from  the  tree  representation 
and  again  applies  the  min/max  criterion  to  determine  if  blocks  of  the  final  cutset  should  be 
merged  into  larger,  not  necessarily  square,  regions. 

2.2.2.1.2.  Grouping  of  Final  Cutset  Segments

After  completion  of  the  previous  step  the  image  consists  of  square  segments  that  range  in 
size  from  single  pixels,  to  the  entire  image.  This  is  the  final  cutset  mentioned  above.  Each  of 
these  segments  is  now  examined  to  determine  if  it  can  be  merged  with  the  blocks  adjacent  to  it 
in  the  image  plane. 

The present implementation of the algorithm does not use any particular order in
examining these segments. This is a portion of the algorithm in which it may be possible to take
advantage of special FLIR imagery characteristics as well as other a priori information in order
to improve segmentation results. Some ideas are discussed in the future work section.

2.2.2.1.3.  Further  Region  Grouping  Based  on  "Nearest  Neighbor"  Criterion

Now the image consists of irregular regions in which the difference between maximum
and minimum gray level is within tolerance. To begin the next step of the processing, the
average gray level of each region is computed. In this step the criterion used is "closeness" in terms
of average gray level. All adjacent regions are checked, and the "closest", if it is "close" enough,
is merged with the region in question. After updating the average gray level of this new region,
all of its neighbors are ranked according to closest average gray level, and the operation is
repeated in this fashion until merging is no longer possible. The next region is then examined
in the same way, and this is repeated until all regions of the image have been considered.
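
The merging loop can be sketched on a region adjacency structure. Everything about the data layout is an assumption of the sketch: regions are identified by hypothetical keys, `means`, `sizes`, and `adjacency` are plain dictionaries, and the average of a merged region is taken as the size-weighted mean of its parts.

```python
def merge_nearest(means, sizes, adjacency, tol):
    """Greedy "nearest neighbor" merging on average gray level.

    means, sizes: dict region -> average gray level / pixel count.
    adjacency:    dict region -> set of adjacent regions (symmetric).
    Returns (owner, means): owner maps each original region to the region
    that absorbed it; means holds the averages of the surviving regions.
    """
    means, sizes = dict(means), dict(sizes)
    adjacency = {r: set(n) for r, n in adjacency.items()}
    owner = {r: r for r in means}
    for region in list(means):              # arbitrary (insertion) order
        if region not in means:             # already absorbed elsewhere
            continue
        while adjacency[region]:
            closest = min(adjacency[region],
                          key=lambda n: abs(means[n] - means[region]))
            if abs(means[closest] - means[region]) > tol:
                break                       # closest neighbor is not close enough
            total = sizes[region] + sizes[closest]
            means[region] = (means[region] * sizes[region]
                             + means[closest] * sizes[closest]) / total
            sizes[region] = total
            # The merged region inherits the neighbors of the absorbed one.
            for n in adjacency.pop(closest):
                adjacency[n].discard(closest)
                if n != region:
                    adjacency[n].add(region)
                    adjacency[region].add(n)
            for r, o in owner.items():
                if o == closest:
                    owner[r] = region
            del means[closest], sizes[closest]
    return owner, means
```

Because the regions are visited in whatever order the dictionary supplies, the result can depend on that order, which mirrors the arbitrary ordering of the implementation described in the text.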

Here  again  the  merging  operation  proceeds  from  region  to  region  in  an  arbitrary  order.  As 
for  the  previous  step,  possible  ordering  criteria  based  on  FLIR  imagery  characteristics  as  well  as 
other  a  priori  information,  are  discussed  in  the  future  work  section. 


Figure  2.2.17  Test  input  image,  and  "segmentation"  by  split  and  merge  step.  (Arbitrary  gray
level  assignment  to  distinguish  regions.)

2.2.2.2.  Some  Segmentation  Results

To  test  the  flexibility  of  the  algorithm  in  segmenting  different  types  of  FLIR  imagery, 
several  representative  FLIR  images  taken  from  the  BRITT  data  set  are  segmented  below  (see 
Figures  2.2.18  -  2.2.21).  Table  2.2.1  summarizes  the  performance  of  the  segmenter  for  each  of 
these  images. 

As seen in these examples, the segmentation results of the algorithm can be quite noisy,
especially for low contrast images. Most non-target regions may be eliminated, however, by
invoking size constraints, perimeter-area ratio constraints, etc.

Although  a  sophisticated  algorithm  was  not  implemented  here,  a  very  simple  cleanup 
using  size  and  brightness  constraints  was  used  to  enhance,  in  most  cases,  the  segmentation 
results.  The  size  constraint  discards  regions  outside  the  100  to  2500  pixel-area  range. 

NOTE: The brightness constraint applied here discards regions with an average gray level
below that of the overall image. This could not be used in general since targets may
occasionally be darker than background.
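
The simple cleanup can be sketched as a filter over a label image. The sketch assumes that regions are given as positive integer labels with 0 meaning "no region"; the function name and array layout are hypothetical, while the 100 to 2500 pixel limits and the brightness test follow the description above.

```python
import numpy as np

def cleanup(labels, image, min_area=100, max_area=2500):
    """Discard regions outside the area range or darker than the image mean."""
    labels = np.asarray(labels)
    image = np.asarray(image, dtype=float)
    overall_mean = image.mean()
    cleaned = labels.copy()
    for label in np.unique(labels):
        if label == 0:                      # assumed "no region" label
            continue
        region = labels == label
        area = int(region.sum())
        too_dark = image[region].mean() < overall_mean
        if not (min_area <= area <= max_area) or too_dark:
            cleaned[region] = 0
    return cleaned
```

As the NOTE points out, the brightness test would discard a target that is darker than the background, so it is suitable only when targets are known to be warm.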

One of the most important features of this segmentation algorithm is its ability to segment
regions within regions. This is shown in Figure 2.2.19 where the hot engine of the truck is
segmented from the truck body. Such information could prove to be very valuable in later
processing where "hot spot" locations within a target may be very useful for classification purposes.

NOTE:  This  information  is  not  retained  by  the  cleanup  process  in  these  examples  simply 
because  the  routine  is  not  sophisticated  enough  to  recognize  such  situations. 

It  is  important  to  note  that  the  quality  of  the  segmentation  results  presented  here  is  due  to 
some  experience  in  selecting  thresholds  for  the  criteria.  The  results  are  very  sensitive  to  the 
selection  of  these  thresholds. 

2.2.2.3.  Comparison  with  EGT  and  Hughes  Segmenters 

To determine how well this segmenter really performs, it was run on the same three sets of
50 images on which the EGT and Hughes segmenters were run in Section 2.1.1.1. The results
appear in Figures 2.2.22 - 2.2.30.

2.2.2.3.1. Threshold Selection

The  original  FLIR  images  were  separated  according  to  the  characteristics  mentioned  in 
Table  2.2.1  (i.e.  contrast,  striatedness,  etc.),  and  thresholds  were  chosen  based  on  experience. 
This, admittedly, gives the tree traversal algorithm an advantage over the constant threshold
used  in  the  EGT  results.  About  ten  segmentations  were  improved  by  it. 

As before, a cleanup routine was run on the segmentation results. Since the routine is not,
nor is it meant to be, very sophisticated, it sometimes discards some of the small interior
regions of targets. Recall that the ability to extract regions from within regions is one of the
more desirable aspects of this segmentation algorithm. For this reason the cleanup routine was
given a size constraint of 20 to 2500 pixels for this experiment. Of course the smaller lower
bound tends to increase the number of extraneous regions retained.

A "hot" and a "cool" target.

Figure 2.2.20 Multiple targets.

Figure 2.2.21 Very difficult segmentations.

Figure 2.2.22 Composite of 50 original FLIR images of APC’s from the BRITT data set.

Figure 2.2.23 Composite of the APC segmentation results.

Figure 2.2.24 Composite of the APC results after cleanup.

Figure 2.2.25 50 original FLIR images of tanks from the BRITT data set.

Figure 2.2.26 Composite of the tank segmentation results.

Figure 2.2.27 Composite of the tank results after cleanup.

Figure 2.2.28 Composite of 50 original FLIR images of trucks from the BRITT data set.

Figure 2.2.29 Composite of the truck segmentation results.

Figure 2.2.30 Composite of the truck results after cleanup.


Table 2.2.1 Comments on the tree traversal segmenter’s performance on the BRITT data.

Images:        britt040
Ground Truth:  APC dead center
Thresholds:    Imax - Imin <= 10; aver. thresh. = 40
Comments:      This is a high contrast image, which means there is a large difference between
               target and background average gray level. A high average criterion threshold
               produces a sharp segmentation.

Images:        britt238
Ground Truth:  APC dead center
Thresholds:    Imax - Imin <= 10; aver. thresh. = 5
Comments:      This is a low contrast image, so the average criterion threshold is much lower,
               causing the background to be broken into multiple regions. Most of these
               regions are removed by the cleanup routine, however.

Images:        britt137, britt029
Ground Truth:  truck dead center
Thresholds:    Imax - Imin <= 10; aver. thresh. = 20
Comments:      Within these targets there are "warm" and "hot" regions. The engine, for
               example, is much warmer than the rest of the target. The average criterion is
               selected to separate the target from the background while also separating the
               engine from the target body. This ability to segment a target into regions is a
               very nice property that may be useful for classification purposes.

Images:        britt277
Ground Truth:  tank dead center; tank lower left
Thresholds:    Imax - Imin <= 10; aver. thresh. = 20
Comments:      Here there are many bright spots visible to the naked eye. The average
               criterion threshold causes the background to be broken into multiple regions.
               Again, most of these are removed by the cleanup routine, but there is
               apparently some background clutter which does not disappear. Also, note that
               the dark region in the center of the original image, "pulled out" by the
               segmentation, is discarded by the cleanup routine. This could be the tread
               portion of the tank and therefore should not have been ignored.

Images:        britt515
Ground Truth:  APC dead center; tank top center
Thresholds:    Imax - Imin <= 20; aver. thresh. = 20
Comments:      The striated nature of this image calls for modification of the min/max
               criterion threshold. The criterion is increased from that used in the previous
               images in order to bridge the striated regions.

Images:        britt347
Ground Truth:  tank dead center
Thresholds:    Imax - Imin <= 10; aver. thresh. = 20
Comments:      The target region was found in the segmentation. However, it was removed by
               the cleanup routine since it was smaller than 100 pixels.

Images:        britt003
Ground Truth:  APC dead center
Thresholds:    Imax - Imin <= 6; aver. thresh. = 5
Comments:      Here the contrast is simply too low to pick out the target.

2.2.2.3.2.  Comparison  Summary 

Following  the  cleanup  routine  the  segmentation  results  tend  to  be  comparable  to  those  of 
the  EGT  and  Hughes  segmenters.  This  is  impressive  since  the  algorithm  is  very  general  and  not 
specifically written for FLIR segmentation. In addition to this, the algorithm also segments hot
and cold regions within a target. In other words, multiple levels of segmentation are possible,
not just target and non-target distinctions.

In  contrast  to  the  EGT  segmenter,  however,  this  algorithm  is  very  sensitive  to  threshold 
selection.  This  could  make  the  segmenter  difficult  to  use  in  an  ATR  environment  since  thres¬ 
holds  must  be  chosen  very  accurately  in  order  to  produce  useful  results. 

2.2.3.  Future  Work 

As  mentioned  during  the  discussion  of  the  tree  traversal  segmentation  algorithm,  portions 
of  the  implementation  could  be  modified  to  take  advantage  of  FLIR  characteristics  and  any 
other  a  priori  information.  For  example,  if  interest  points  within  the  image  are  identified  prior 
to  segmentation  attempts,  this  information  could  be  used  to  direct  the  region  growing  of  steps 
two  and  three.  Points  of  interest  would  be  examined  first  during  these  region  growing 
processes,  and,  especially  for  low  contrast  imagery,  this  would  reduce  the  possibility  of 
"interest  regions"  being  absorbed  into  the  background.  With  regard  to  the  classification  prob¬ 
lem,  the  ability  to  segment  hot  and  cold  regions  within  a  target  will  probably  become  important. 
Engine  location,  for  example,  will  be  very  useful  information  when  determining  target  class.  Of 
course  before  the  tree  traversal  segmentation  algorithm  can  be  used  in  an  ATR  environment, 
methods  of  automatic  threshold  selection  must  be  developed. 

We  now  have  available  three  segmenters  which  work  well  with  FLIR  imagery.  In  the 
future  we  will  be  using  these  segmenters  (one  or  more  of  them)  to  assist  in  fusing  the  FLIR  data 
with  the  LADAR  data. 

2.3.  THE  USE  OF  HIGH  LEVEL  REASONING  TO  IMPROVE  THE  CLASSIFICA¬ 
TION  OF  FLIR  DATA 

Current  production  algorithms  for  processing  FLIR  images  work  in  a  bottom-up  approach. 
That  is,  they  start  at  the  pixel  level  and  segment  the  image  into  regions  and  then  label  (i.e.  clas¬ 
sify)  each  region  based  on  its  contents.  The  labeling  of  a  given  region  is  generally  done  in  a 
context-independent  manner  without  consulting  the  spatial  inferences  in  neighboring  regions. 
More  consistent  labeling  and  therefore  better  recognition  accuracy  may  be  obtainable  by  using 
global  information  about  the  scene.  Such  an  approach  could  examine  each  region  individually 
and  give  each  a  label  and  a  confidence  value  for  that  label.  Next  all  the  regions  could  be 
examined to see if the labeling is consistent. If the labeling is not consistent, hypotheses for
relabeling  could  be  generated  based  on  partial  evidence  in  each  region  and  global  information 
about  the  scene.  The  regions  could  then  be  reexamined  to  try  to  verify  these  hypotheses. 

Hypothesize-and-verify approaches are difficult to incorporate in purely bottom-up
algorithms; therefore we are investigating the use of hierarchical data structures which will
lend themselves to the construction of scene hypotheses at different scales. We are currently
investigating LoG (Laplacian-of-Gaussian) channels, pyramids, and quadtrees, which could
comprise the lower levels of a hierarchy.

Section 2.3.1 presents some possible symbolic data structures and discusses the advantages
of each. Section 2.3.2 presents the various pixel level data hierarchies. At some point we
feel it will be necessary to switch from a pixel description of an image to a symbolic description
where each location in the image will have information describing which higher level objects in
the scene it is part of. Such information will greatly help the reasoning process. Section 2.3.3
presents a high-level reasoning system which converts edge descriptions to symbols and uses
global information to construct hypotheses that help make the edge labeling consistent in a
scene.

2.3.1.  Data  Structures  for  Symbolic  Reasoning 

The  following  is  a  brief  description  of  some  preliminary  ideas  for  a  data  structure  for  sym¬ 
bolic  reasoning.  This  symbolic  list  provides  a  link  between  the  pixel  level  segmentation  and  the 
high  level  reasoning  by  taking  pixel  level  features  (such  as  lines,  edges,  and  regions)  and  storing 
them  in  a  data  structure  so  that  a  high  level  reasoning  process  can  easily  access  related  informa¬ 
tion.  Once  an  image  is  decomposed  into  edges  and  regions,  graph  theory  provides  a  natural 
means  for  describing  the  image.  The  following  section  gives  some  common  graph  theory 
definitions  which  will  then  be  used  in  describing  the  symbolic  data  structure. 

2.3.1.1.  Some  Graph  Theory  Definitions 

A  segmented  image  may  be  thought  of  as  a  planar  graph.  There  are  edges  (line  segments), 
vertices  (line  segment  intersection  points),  and  faces  (regions).  The  following  paragraphs 
describe  these  objects. 

An  edge  is  a  line  segment  terminated  by  two  vertices  or  by  a  single  vertex  if  it  is  a  loop. 
A  vertex  is  the  endpoint  of  a  single  edge,  an  arbitrary  single  point  in  a  closed  edge  (loop),  or  the 
intersection  point  of  three  or  more  edges.  If  the  degree  of  a  vertex  is  defined  as  the  number  of 
edges  incident  with  it,  each  loop  counting  as  two  edges  [BonMur76],  then  with  the  above 
definitions  of  edge  and  vertex  the  only  way  for  a  vertex  to  be  of  degree  2  is  if  it  is  part  of  a  loop. 
A vertex of degree 1 is simply a way of specifying the endpoint of a “dangling” edge. Aside from
the  above  two  cases  (the  termination  point  of  an  edge  or  its  intersection  with  itself),  the  only 
other  place  vertices  occur  is  at  the  intersection  of  3  or  more  edges.  This  is  important  in  that  no 
matter  how  many  crazy  turns  an  edge  makes,  a  vertex  will  only  be  defined  if  it  intersects 
something. This keeps 90-180 degree bends in an edge from being defined as vertices.

Let us consider regions next. First let us define a walk. A walk is a non-null sequence
$W = v_0 e_1 v_1 e_2 v_2 \cdots e_k v_k$ whose terms are alternately vertices and edges, such that
for $1 \le i \le k$ the ends of $e_i$ are $v_{i-1}$ and $v_i$. $v_0$ and $v_k$ are the origin and
terminus of $W$, respectively, and $W$ is said to be of length $k$ [BonMur76]. A region therefore
may be specified by the closed walk (terminus = origin = $v_0$) defining its border. Let us define
a region as such a walk $W$ of length $k$ surrounding area $A(W)$ such that there are not two
shorter walks $X$ of length $i$ and $Y$ of length $j$, $i < k$ and $j < k$, with
$A(X) \cup A(Y) = A(W)$. If there are any closed walks contained in walk $W$, the regions defined
by those walks are not included in the region defined by $W$ (i.e. regions are non-overlapping).

2.3.1.2. The Data Structures

Figure  2.3.1  shows  a  target  which  has  been  segmented  into  edges,  vertices,  and  regions, 
each  of  which  has  been  labeled.  The  edges,  vertices,  and  regions  of  any  image  which  has  been 
segmented  as  in  Figure  2.3.1  can  be  represented  symbolically  by  the  following  data  structures. 

1.  A  List  of  Edges 

The  edge  list  stores  a  count  of  the  number  of  edges  in  the  image  and  has  a  pointer  to  a 
linked  list  containing  all  the  edges.  The  edge  label,  length  of  edge,  and  the  location  of 
endpoints  are  stored  for  each  edge  as  shown  in  Figure  2.3.1. 

2.  A  List  of  Vertices 

The  vertices  list  keeps  a  count  of  the  number  of  vertices  and  points  to  a  linked  list  of  them. 
The  degree  of  a  given  vertex,  the  vertex  label,  and  the  location  of  the  vertex  are  stored  for 
each  vertex  (see  Figure  2.3.1). 

3.  A  List  of  Regions 

The  regions  list  contains  a  count  of  the  regions  in  the  image  and  a  pointer  to  a  linked  list 
of  regions,  each  containing  the  region  label,  a  count  of  edges,  and  a  pointer  to  a  linked  list 
of  edges  and  vertices  tracing  the  walk  that  defines  the  region.  (See  Figure  2.3.1.) 
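
The three lists can be sketched as simple records. The sketch below is illustrative only and
makes several assumptions: Python lists stand in for the linked lists described above (their
length gives the stored count), and the names `Edge`, `Vertex`, `Region`, and `SymbolicImage`
and their field names are hypothetical, not the names used in our implementation.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[int, int]  # assumed (row, column) image coordinates


@dataclass
class Edge:
    label: int
    length: float
    endpoints: Tuple[Point, Point]  # locations of the two endpoints


@dataclass
class Vertex:
    label: int
    degree: int
    location: Point


@dataclass
class Region:
    label: int
    # Closed walk around the region, stored as alternating
    # ("v", vertex label) and ("e", edge label) entries.
    walk: List[Tuple[str, int]] = field(default_factory=list)

    @property
    def edge_count(self) -> int:
        return sum(1 for kind, _ in self.walk if kind == "e")


@dataclass
class SymbolicImage:
    edges: List[Edge] = field(default_factory=list)
    vertices: List[Vertex] = field(default_factory=list)
    regions: List[Region] = field(default_factory=list)

    def counts(self) -> Tuple[int, int, int]:
        """Number of edges, vertices, and regions."""
        return len(self.edges), len(self.vertices), len(self.regions)


# A single closed edge (loop) through one vertex, bounding one region.
img = SymbolicImage()
img.vertices.append(Vertex(label=1, degree=2, location=(10, 10)))
img.edges.append(Edge(label=1, length=24.0, endpoints=((10, 10), (10, 10))))
img.regions.append(Region(label=1, walk=[("v", 1), ("e", 1), ("v", 1)]))
```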

The  three  lists  above  provide  a  compact  means  of  storing  the  individual  edge,  vertex,  and 
region  data.  The  following  tables  store  information  about  the  relationships  between  the  various 
edges,  vertices,  and  regions. 

1.  Adjacency Table

The adjacency table (for an example see Table 2.3.1 and Figure 2.3.1) has one column and
one row for each vertex in the image. Entry adjacency[i][j] gives the number of edges
joining $v_i$ and $v_j$. This table is useful for finding loops since a nonzero entry at
adjacency[i][i] shows that vertex $v_i$ is part of a loop.

2.  Region  Boundary  Table 

The boundary table (see Table 2.3.2 for an example) has one column for each edge and
one row for each region in the image. If boundary[i][j] = 1, edge $e_j$ is part of the
boundary of region $r_i$.

This table can be used to easily compute the number of edges around region $r_i$ by
summing the entries in row i. It can also be used to see if a region completely contains
another region. To test whether region $r_i$ is completely surrounded by another region,
find the set of edges, E, which have a 1 in row i of the boundary table. If each edge in E
bounds the same region $r_j \ne r_i$, then $r_j$ completely surrounds $r_i$.

Figure 2.3.1 Continued.

Table 2.3.1 The adjacency table for Figure 2.3.1. It has one row and one column for each
vertex.

3.  The  Incidence  Table 

The incidence table (Table 2.3.3 is an example) has one column for each edge and one
row for each vertex. incidence[i][j] equals the number of times $v_i$ and $e_j$ are incident.
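
As an illustration of how the adjacency and incidence tables relate to the edge list, the
following sketch builds both from a list of (start vertex, end vertex) pairs. It is not our
implementation: the function names are hypothetical, vertices and edges are numbered from
zero for convenience, and a loop is assumed to add 1 to its diagonal adjacency entry and 2
to its incidence entry, so that row sums of the incidence table give the vertex degrees as
defined earlier.

```python
def build_tables(num_vertices, edges):
    """Build adjacency and incidence tables.

    edges[j] = (a, b) means edge j joins vertices a and b; a == b is a loop.
    """
    adjacency = [[0] * num_vertices for _ in range(num_vertices)]
    incidence = [[0] * len(edges) for _ in range(num_vertices)]
    for j, (a, b) in enumerate(edges):
        adjacency[a][b] += 1
        if a != b:
            adjacency[b][a] += 1  # keep the table symmetric
        incidence[a][j] += 1
        incidence[b][j] += 1      # a loop is incident twice on its vertex
    return adjacency, incidence


def loop_vertices(adjacency):
    """Vertices with a nonzero diagonal entry, i.e. vertices on a loop."""
    return [i for i, row in enumerate(adjacency) if row[i] != 0]


def degrees(incidence):
    """Vertex degrees as row sums of the incidence table."""
    return [sum(row) for row in incidence]


# Vertex 0 is a junction of three dangling edges; vertex 4 lies on a loop.
example_edges = [(0, 1), (0, 2), (0, 3), (4, 4)]
adjacency, incidence = build_tables(5, example_edges)
```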

2.3.1.3. An Example of Reasoning with the Symbolic Data Structure

A  number  of  useful  reasoning  processes  can  be  easily  supported  by  the  above  lists  and 
tables.  For  example,  suppose  we  are  trying  to  recognize  the  target  in  Figure  2.3.1.  Suppose  a 
rule  in  our  knowledge  base  was: 

Due  to  the  nature  of  treads  on  a  tank,  you  will  find  several  loops  in  a  horizontal  line  near 
the  bottom  of  the  target. 
Table 2.3.2 The boundary table for Figure 2.3.1. It has one column for each edge and one row
for each region.

                                      edges
boundary    1   2   3   4   5   6   7   8   9  10  11  12  13  14
   0        1   1   0   1   1   0   1   0   1   0   0   0   0   0
   1        1   0   1   1   0   1   0   0   0   0   0   0   0   0
   2        0   1   1   0   0   0   0   0   0   0   0   0   0   0
   3        0   0   0   0   1   1   1   1   0   0   0   0   0   0
   4        0   0   0   0   0   0   0   1   1   1   1   1   1   1
   5        0   0   0   0   0   0   0   0   0   1   0   0   0   0
   6        0   0   0   0   0   0   0   0   0   0   1   0   0   0
   7        0   0   0   0   0   0   0   0   0   0   0   1   0   0
   8        0   0   0   0   0   0   0   0   0   0   0   0   1   0
   9        0   0   0   0   0   0   0   0   0   0   0   0   0   1

Table  2.3.3  The  incidence  table  for  Figure  2.3.1.  It  has  one  column  for  each  edge  and  one 
row  for  each  vertex. 
Our reasoner wants to find support for this rule, so it looks for loops by consulting the diagonal
of the adjacency table. It finds that $v_7$ - $v_{11}$ are all loops (they all have a nonzero value
on the diagonal). The incidence table shows that $v_7$ - $v_{11}$ are incident on edges
$e_{10}$ - $e_{14}$ respectively. Consulting the boundary table shows that edges $e_{10}$ - $e_{14}$
all bound region $r_4$ and individually bound regions $r_5$ - $r_9$ respectively. This means
$r_5$ - $r_9$ are all completely contained in $r_4$. Consulting the coordinates for
$e_{10}$ - $e_{14}$ (stored in the edge list) shows that $e_{10}$ - $e_{14}$ are horizontal.
Therefore we have found support for the above rule by simply consulting the lists and
tables.
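
The containment step of this example can be checked mechanically. The sketch below copies
the entries of Table 2.3.2 and applies the test described for the boundary table: a region is
surrounded by another region when every edge bounding it also bounds that other region. The
function names are hypothetical, and the code is an illustration rather than the reasoner
itself.

```python
# Table 2.3.2: BOUNDARY[i][j - 1] == 1 when edge e_j bounds region r_i.
BOUNDARY = [
    [1, 1, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0],  # r0
    [1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],  # r1
    [0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # r2
    [0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0],  # r3
    [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1],  # r4
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],  # r5
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],  # r6
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0],  # r7
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0],  # r8
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],  # r9
]


def edge_count(boundary, i):
    """Number of edges around region i (sum of row i)."""
    return sum(boundary[i])


def surrounding_region(boundary, i):
    """Return the region that completely surrounds region i, or None."""
    edges = [j for j, flag in enumerate(boundary[i]) if flag]
    if not edges:
        return None
    for k in range(len(boundary)):
        if k != i and all(boundary[k][j] for j in edges):
            return k
    return None


contained = {i: surrounding_region(BOUNDARY, i) for i in range(len(BOUNDARY))}
```

With the table above, only regions 5 through 9 are reported as surrounded, each by region 4,
which agrees with the reasoning in the example.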

2.3.1.4. Conclusions

We  have  described  the  symbolic  structures  we  are  currently  using  and  have  shown  that 
they  are  useful  for  supporting  the  queries  of  a  reasoning  process.  These  structures  will  change 
as  different  methods  of  reasoning  are  tried  and  weaknesses  in  the  structures  are  found.  It  is  our 
hope  that  combining  the  symbolic  data  structures  with  the  hierarchical  methods  will  result  in  a 
powerful,  yet  compact,  method  for  storing  the  relevant  information  in  an  image  so  that  it  can  be 
used  by  some  sort  of  high  level  reasoning  program. 

2.3.2.  Pixel  Level  Hierarchical  Data  Structures 

The lower few levels of our hierarchy will probably not contain any symbolic information;
instead they will contain the low level pixel information. The following two sections discuss
three popular data structures for pixel level hierarchies. Section 2.3.2.3 approaches the problem
of determining where in a hierarchy (which level) to start looking for targets.

2.3.2.1.  LoG  Channels 

LoG  channels  are  a  popular  research  topic  because  they  process  images  much  like  the 
human  visual  system  does.  A  LoG  channel  is  computed  by  taking  a  Laplacian  of  a  Gaussian 
transform;  the  Laplacian  is  the  orientation  independent  second  order  derivative  operator,  and 
the  Gaussian  provides  a  tunable  smoothing,  i.e.  it  allows  one  to  select  how  much  smoothing  to 
use.  Figure  2.3.2  shows  a  LoG  model  of  the  human  visual  system.  Initially  processing  is  done 
at  the  coarsest  levels  where  everything  becomes  blob-like.  At  this  level  blobs  in  the  left  image 
are  easily  matched  with  blobs  in  the  right  image.  As  more  information  is  needed,  the  higher 
resolution  levels  are  consulted.  Figure  2.3.3  shows  a  FLIR  image  and  four  of  its  LoG  images. 
A  hierarchical  vision  system  would  examine  each  of  these  representations  and  try  to  construct 
hypotheses  about  the  contents  of  the  scene. 

It  is  interesting  to  note  that  while  it  may  be  difficult  to  hypothesize  the  existence  of  a  road 
at  the  lowest  channels  (because  the  lowest  channels  are  more  dominated  by  small  pixel-to-pixel 
variations  and  are  less  capable  of  grouping  together  pixels  to  construct  large  scale  detail),  the 
same  task  could  prove  relatively  easy  in  a  coarser  channel. 
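
A LoG channel of the kind discussed in this section can be sketched in plain Python. This is
an illustration under stated assumptions, not the code used to produce Figure 2.3.3: the
kernel is truncated at four standard deviations, its small residual mean is removed so that
flat areas give zero response, border pixels are replicated, and all function names are
hypothetical. The kernel is the Laplacian of a unit-volume Gaussian,
((x^2 + y^2 - 2 sigma^2) / sigma^4) G(x, y).

```python
import math


def log_kernel(sigma, radius=None):
    """Sampled Laplacian-of-Gaussian kernel with zero mean."""
    if radius is None:
        radius = int(math.ceil(4 * sigma))
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            r2 = x * x + y * y
            g = math.exp(-r2 / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)
            row.append((r2 - 2 * sigma ** 2) / sigma ** 4 * g)
        kernel.append(row)
    mean = sum(map(sum, kernel)) / (2 * radius + 1) ** 2
    return [[v - mean for v in row] for row in kernel]


def convolve(image, kernel):
    """Same-size filtering; pixels outside the image repeat the nearest border pixel."""
    h, w = len(image), len(image[0])
    r = len(kernel) // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    ii = min(max(i + di, 0), h - 1)
                    jj = min(max(j + dj, 0), w - 1)
                    acc += kernel[di + r][dj + r] * image[ii][jj]
            out[i][j] = acc
    return out


def log_channels(image, sigmas):
    """One filtered image per smoothing scale; larger sigma gives a coarser channel."""
    return {s: convolve(image, log_kernel(s)) for s in sigmas}


# A single bright pixel on a dark background, filtered at two scales.
spot = [[0.0] * 9 for _ in range(9)]
spot[4][4] = 100.0
channels = log_channels(spot, [1.0, 2.0])
```

With this sign convention a bright spot produces a negative response at its center, and the
extent of the response grows with sigma.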

2.3.2.2.  Pyramids  and  Quadtrees 

Pyramids and quadtrees may be the most promising hierarchical data structures for reasoning
in computer vision because they are naturally a hierarchy, with less information stored in the
upper levels and more details stored at the lower levels. Such an arrangement allows for quick
location of relevant parts of the image while ignoring irrelevant details.

Figure 2.3.2 LoG channel model of human visual system.

Figure 2.3.4 shows a 256 x 256 FLIR image of three targets which is mapped to each
higher level by taking an average over the four pixels below it. A hierarchical vision system
would scan the 4 x 4, 8 x 8, and 16 x 16 levels and not find much of interest. This is because
there is nothing large in the image. The bright spots and dark spots at the 32 x 32 level may
contain something of interest. The pyramids below these spots could then be examined in
hopes of finding a target. The pyramid has reduced the search space from blindly searching all
256 x 256 = 64k pixels of the original image to searching 4 x 4 + 8 x 8 + 16 x 16 + 32 x 32 =
1360 pixels. After searching these pixels the search was directed to areas of the image that were
most likely to contain information of interest.

This  example  has  shown  how  a  pyramid  constructed  by  averaging  pixels  as  a  map  between 
levels  can  reduce  the  number  of  pixels  to  be  searched.  (A  similar  example  could  be  constructed 
for  quadtrees.) 
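
The averaging pyramid and the pixel count quoted above can be reproduced with a short
sketch. It assumes a square image whose side is a power of two, and the function names are
hypothetical; it is meant only to illustrate the construction, not to reproduce our code.

```python
def reduce_level(image):
    """Halve each dimension by averaging non-overlapping 2 x 2 blocks."""
    n = len(image)
    return [
        [
            (image[2 * i][2 * j] + image[2 * i][2 * j + 1]
             + image[2 * i + 1][2 * j] + image[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(n // 2)
        ]
        for i in range(n // 2)
    ]


def build_pyramid(image, top_size=4):
    """Return levels from the original image up to a top_size x top_size level."""
    levels = [image]
    while len(levels[-1]) > top_size:
        levels.append(reduce_level(levels[-1]))
    return levels


def pixels_examined(sizes):
    """Total number of pixels in square levels of the given side lengths."""
    return sum(s * s for s in sizes)


# Small synthetic image so the example runs quickly.
image = [[float((r * 16 + c) % 7) for c in range(16)] for r in range(16)]
levels = build_pyramid(image)
sizes = [len(level) for level in levels]

coarse_search = pixels_examined([4, 8, 16, 32])  # levels scanned in the text
full_search = pixels_examined([256])             # every pixel of a 256 x 256 image
```

Because each level averages equal-sized blocks, the mean gray level is the same at every
level; only spatial detail is lost.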

2.3.2.3.  Classification  on  a  Pyramid 

A question to ask is: how do you know which level to start searching on? To answer
this question we constructed pyramids out of each of the targets used in the BRITT data
classification experiment. Each level of each pyramid was constructed by averaging
the four pixels from the level below, as discussed in the previous paragraphs. We then ran the
classification experiments on each of the levels. We had hoped to see the classification errors
rise slowly as we moved up the pyramid since there was less information at each level. Then
we expected to see the error rate jump once we hit a level where too much information was
averaged out. However, Table 2.3.4 shows that the results are not what we expected.


Table  2.3.4  Results  of  classification  experiments  on  pyramid  built  from  BRITT  data.  The 
values  in  the  table  are  the  percentage  of  the  targets  misclassified. 


level 

feature  set  1 
upper  lower 

feature  set  2 
upper  lower 

feature  set  3 
upper  lower 
59.6

61.3

5^  .9

49.3

61.3

61.3

59.1

46.2 

54.7 

64.9 

64.9 

The results show that the error rates do not really increase as we move up the pyramid to levels
with less information. One could hastily conclude that the important information is preserved
from level to level in the pyramid. However, take for example the 62.3% lower bound error
rate for feature set 1 on level zero. This is a very poor error rate (as discussed in Section
2.1.1.4.4). Since the error rate is so poor on level 0, it should be no surprise that it does not get
much worse at higher levels, simply because it is hard to make any more errors. With three
classes to pick from, a random guess would result in an error rate of 66.7%, and level 0 already
has an upper bound of 62.3%.

Figure 2.3.4 Different levels of a pyramid representation of a 320 x 496 FLIR image padded
to 512 x 512. (a) 4 x 4, (b) 8 x 8, (c) 16 x 16, (d) 32 x 32, (e) 64 x 64, (f) 128 x
128, (g) 256 x 256, (h) original 512 x 512 image.

One  can  draw  one  more  conclusion  from  these  results:  There  is  a  mismatch  between  the 
digital  sampling  rates  implied  by  the  matrix  sizes  used  for  FLIR  data  and  the  intrinsic  resolution 
of  a  FLIR  sensor.  As  demonstrated  by  the  feature  set  2  above,  a  reduced  representation  of  a 
FLIR  image  on  a  much  smaller  matrix  does  not  cause  any  noticeable  reduction  in  the 
classification  information. 

2.3.3.  Using  a  Global  Map  to  Improve  Edge  Labeling 

FLIR  images  rarely  have  enough  clarity  and  pixels  on  target  to  extract  and  accurately  label 
detailed  edges  as  shown  in  Figure  2.3.1.  More  often  the  line  segments  will  consist  of  several 
edges  separated  by  gaps,  edges  with  many  noise  edges  attached,  or  both.  Many  schemes  have 
been  devised  to  eliminate  noise  edges  and  close  gaps  between  edges  in  the  same  line.  A  better 
approach  to  the  problem  is  to  include  global  information  about  the  area  being  processed.  For 
example  assume  we  have  a  global  map  of  the  area  being  viewed  by  the  sensor  and  a  rough  idea 
of  where  the  sensor  is  in  the  map.  The  map  would  contain  objects  such  as  rivers,  roads,  lakes, 
mountains,  etc.  The  location  of  the  sensor  could  be  found  by  some  global  navigation  system. 
The  map  and  approximate  location  are  then  used  to  build  an  approximate  model  of  what  is 
being  viewed  by  the  sensor  and  the  model  is  used  to  guide  the  labeling  of  the  edges.  The 
regions  bounded  by  the  edges  could  be  labeled  as  sky,  mountains,  plains,  forest,  etc.  Such 
classification  could  then  be  used  to  direct  the  target  detector  where  to  look  for  targets. 

In the Robot Vision Lab we are building a general purpose software tool called PSEIKI
[AndKak87]  (a  Production  System  Environment  for  Integrating  Knowledge  with  Images).  Our 
plan  is  to  apply  PSEIKI  to  NVL  projects  where  symbolic  reasoning  and  integration  with  world 
knowledge  can  be  applied.  PSEIKI  is  a  rule  based  system  written  in  OPS. 

The  following  section  describes  a  system  which  takes  some  images  at  various  locations  on 
a  sidewalk  and  uses  a  global  map  of  the  sidewalk  to  help  label  the  edge  segments  which  belong 
to each edge of the sidewalk. We could have used FLIR data; however, we do not have any
global maps which correspond with the FLIR images in our database.

2.3.3.1. System Goals

The first goal of PSEIKI is: given an approximate location on the sidewalk and a
corresponding view, find the edges of the sidewalk. Figure 2.3.5 shows the kinds of views that
were taken of the sidewalk. The images on the right, used for drawing inferences about the
location and the direction of the sidewalk, were obtained by applying edge detection and
thinning operators to the images on the left. The architecture for the edge labeling process is
shown in Figure 2.3.6. The process has two subsections: the preprocessor/pixel-to-symbol
converter and a rule-based edge labeler.

Figure 2.3.5 Typical images to be processed by PSEIKI.

2.3.3.2. Preprocessing and Conversion to Symbolic Form

The  preprocessor  accepts  digitized  images  and  outputs  binary  edges  suitable  for  conver¬ 
sion  into  symbolic  form.  The  edges  are  detected  by  applying  a  Sobel  operator  to  the  digitized 
gray scale image. These edges are then thinned via Eberlein’s algorithm [Eber76] and
thresholded. The resulting binary images are thinned again to produce edges that are at most one
pixel  wide.  Small  edges  are  also  deleted  by  the  preprocessor.  At  this  point,  the  image  is  ready 
to  be  converted  into  symbolic  form. 
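
The first and third preprocessing steps (Sobel edge detection and thresholding) can be
illustrated as follows. The sketch omits both thinning passes and the deletion of small edges,
leaves border pixels at zero, and uses hypothetical function names and an arbitrary threshold;
it is not the C preprocessor described here.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Gradient magnitude at interior pixels; border pixels are left at 0."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            gx = gy = 0.0
            for di in range(3):
                for dj in range(3):
                    p = image[i + di - 1][j + dj - 1]
                    gx += SOBEL_X[di][dj] * p
                    gy += SOBEL_Y[di][dj] * p
            out[i][j] = math.hypot(gx, gy)
    return out


def threshold(magnitude, level):
    """Binary edge map: 1 where the gradient magnitude reaches the level."""
    return [[1 if v >= level else 0 for v in row] for row in magnitude]


# A vertical step edge: dark on the left, bright on the right.
step = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
magnitude = sobel_magnitude(step)
binary_edges = threshold(magnitude, level=20)
```

The step edge comes out two pixels wide in `binary_edges`, which is why a thinning pass is
needed before the edges are converted to symbolic form.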

The  conversion  to  symbolic  form  is  accomplished  via  an  algorithm  based  on  the  Nevatia- 
Babu  line-finder  [NeBa80].  In  this  process,  the  following  steps  are  performed.  First,  some  pix¬ 
els  are  labeled  as  vertices.  The  pixels  so  labeled  are  edge  endpoints  and  the  points  at  which  two 
or  more  edges  intersect.  The  edges  in  the  segmented  image  are  then  traced  from  the  starting  to 
ending  vertices  and  are  represented  as  broken  line  segments.  After  each  edge  is  converted  to 
symbolic  form,  it  contains  the  following  information:  edge  number,  start  vertex,  end  vertex, 
length  and  strength  (average  gradient  magnitude).  Likewise,  each  vertex  contains  the  following 
information:  row  coordinate,  column  coordinate,  vertex  number  and  degree. 

2.3.3.3.  Rule  Based  Edge  Finder 

The  rule-based  labeling  system  is  written  in  OPS5  [BFKM86]  and  is  split  into  three  subsys¬ 
tems.  The  first  subsystem  is  statement  driven  and  does  not  employ  an  inexact  reasoning 
scheme.  Its  purpose  is  to  overcome  segmentation  deficiencies  and  reduce  the  amount  of  data 
seen  by  the  sections  that  do  use  inexact  reasoning.  There  are  two  main  ways  that  the  current 
segmentation  is  deficient.  First  the  segmentation  procedure  leaves  small  edges  caused  by  noise. 
Although  many  of  these  edges  are  eliminated  during  the  segmentation  process,  others  still 
remain  because  they  are  connected  to  longer  segments.  These  noise-edges  often  look  like  the 
example  in  Figure  2.3.7.  The  first  section  of  the  expert  system  eliminates  these  “dangling” 
edges  (all  segments  which  are  shorter  than  a  specified  length  and  have  a  degree  one  vertex). 
Figure 2.3.8 is a pseudo-code example of one of the rules used to delete a small, dangling edge.


IMAGE PREPROCESSOR IN C: Sobel | thin | threshold | binary thin | convert

(A): Data Reduction

(B): Update beliefs using local consistency

(C): Update beliefs using global consistency


Figure 2.3.6 Architecture for edge labeling process.

RULE: remove-small-segments

IF:   current context is remove-small-segments
AND:  segment with name <small-segment> has
          length < 4 and
          start vertex with name <start> and
          end vertex with name <end>
AND:  vertex with name <start> has
          degree 1
THEN: delete segment <small-segment>
AND:  make contexts to decrement the degrees
          of vertices <start> and <end>

Figure  2.3.8  Rule  to  delete  small  edges. 


The  segmentation  used  also  produces  artifacts  that  break  lines  into  smaller  line  segments.  The 
system  tries  to  compensate  for  this  fact  by  rejoining  these  broken  line  segments.  Finally  the 
first  section  combines,  into  a  single  line,  segments  that  are  joined  at  a  degree-two  vertex.  This 
action  is  demonstrated  in  Figure  2.3.9  and  the  pseudo-code  for  a  rule  that  performs  the  merging 
action  is  shown  in  Figure  2.3.10. 

RULE: merge-joined-segments

IF:   current context is merge-joined-segments
AND:  vertex with name <common-vertex> has degree 2
AND:  segment with name <segment1> ends at vertex <common-vertex>
AND:  segment with name <segment2> ends at vertex <common-vertex>
THEN: remove <common-vertex>, <segment1>, <segment2>
AND:  create a new segment to replace <segment1> and <segment2>

Figure  2.3.10  Rule  to  merge  connected  segments  into  a  single  segment. 


The  overall  result  of  these  subprocesses  is  a  cleaner  image  containing  a  substantially  reduced 
number  of  line  segments.  Experimental  results  demonstrate  that  the  amount  of  pruning  is 
between  66%  and  75%;  this  is  obviously  a  large  reduction  in  the  amount  of  data.  Figure  2.3.11 
shows  an  example  of  input  to  and  output  from  the  first  section  of  the  expert  system. 
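
The two data-reduction rules can be sketched in a few lines of Python. This is only an illustration, not the OPS5 implementation: the dictionary layout of a segment, the function names, the choice to test either endpoint for degree one, and the use of summed lengths for a merged segment are all assumptions made for the example. The sketch also ignores the angle between the merged segments.

```python
from collections import Counter


def vertex_degrees(segments):
    """Count how many segment ends touch each vertex."""
    return Counter(v for s in segments for v in (s["start"], s["end"]))


def remove_dangling(segments, min_length=4):
    """Repeatedly delete short segments that have a degree-one vertex."""
    segs = list(segments)
    while True:
        degree = vertex_degrees(segs)
        victim = next(
            (s for s in segs
             if s["length"] < min_length
             and 1 in (degree[s["start"]], degree[s["end"]])),
            None,
        )
        if victim is None:
            return segs
        # Degrees are recomputed on the next pass, which plays the role
        # of the "decrement the degrees" action in the rule.
        segs = [s for s in segs if s is not victim]


def merge_degree_two(segments):
    """Repeatedly merge the two segments that meet at a degree-two vertex."""
    segs = list(segments)
    while True:
        degree = vertex_degrees(segs)
        for v, d in degree.items():
            pair = [s for s in segs if v in (s["start"], s["end"])]
            if d == 2 and len(pair) == 2:
                break
        else:
            return segs  # no degree-two vertex left to merge

        def far_end(s):
            return s["end"] if s["start"] == v else s["start"]

        a, b = pair
        merged = {"start": far_end(a), "end": far_end(b),
                  "length": a["length"] + b["length"]}  # assumed length rule
        segs = [s for s in segs if s is not a and s is not b] + [merged]


def reduce_segments(segments, min_length=4):
    """Data reduction: prune dangling edges, then merge joined segments."""
    return merge_degree_two(remove_dangling(segments, min_length))
```

For instance, two long segments that share a vertex with one short spur attached to it reduce to a single segment: the spur is pruned first, which turns the shared vertex into a degree-two vertex, and the two long segments are then merged.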

The second subsystem performs segment labeling and confidence estimation. The system
uses the expected position in the global map to estimate where it should see the sidewalk's
edges in its field of view. For example, if the sidewalk is expected to have a right-hand turn, it
should see the left and right edges of the sidewalk before the turn and the top and bottom edges
of the sidewalk after the turn. The expert system was designed to use the position of expected
(model) edges to label those found in the segmented image. The edge labels assigned consist of
the following information: the name of the corresponding model edge and a certainty factor
between -1 and 1 describing the confidence attached to that labeling. All segments are initially
labeled as the model edge to which they correspond most closely.

Figure 2.3.11 Example of the first section of the expert system. (a) Input image, (b) Output of
preprocessor, (c) Output of first section of expert system.

The initial labels and confidence estimates are based on the collinearity between the edge
in question and the model edge. Collinearity is defined as follows:

$$\mathrm{Col}(\langle segment1 \rangle, \langle segment2 \rangle) = \frac{D_{max} - D_{seg}}{D_{max}} \cos\theta$$

where $\theta$ is the angle between the detected and the model segment, $D_{seg}$ is the distance from the
middle of the segment to the model edge, and $D_{max}$ is the maximum possible distance from the
model edge to the detected edge (see Figure 2.3.12). If the edge is farther than $D_{max}$ away from
the model edge, the collinearity is defined to be zero.
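
As a small numerical illustration, the measure can be written as a Python function. The sketch assumes the collinearity has the form (1 - Dseg/Dmax) cos θ, which is consistent with the definitions above since it falls to zero when Dseg reaches Dmax; the function and parameter names are hypothetical.

```python
import math


def collinearity(theta_rad, d_seg, d_max):
    """Collinearity of a detected segment with a model edge.

    theta_rad: angle between the detected segment and the model edge
    d_seg:     distance from the segment midpoint to the model edge
    d_max:     maximum distance at which a segment is still considered
    Assumed form: (1 - d_seg / d_max) * cos(theta).
    """
    if d_seg > d_max:
        return 0.0  # too far from the model edge, defined to be zero
    return (1.0 - d_seg / d_max) * math.cos(theta_rad)
```

A segment lying exactly on the model edge and parallel to it scores 1; a parallel segment halfway out to Dmax scores 0.5.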

After the initial labels are assigned, they cannot be changed. However, every segment's
confidence value is updated based on the consistency between the labels of all other segments
and the expected geometry. The belief that a segment's label is correct based on new evidence
is computed separately from the confidence that it is incorrect (its disbelief). New and old
evidence is combined using a variant of the Dempster-Shafer theory of evidence [Shaf76]. The
following rules demonstrate how the (dis)belief that a segment's label is correct is updated
using the labels of other segments. If two segments are thought to lie on the same model edge, the
confidence that the label is correct is increased if the segments are highly collinear. This is
expressed as shown in Figure 2.3.13.

RULE: update-belief-compatible

IF:   current context is update-certainty of <segment1>
AND:  label of <segment1> is <label1>
AND:  there is a segment named <segment2>
          with label <label2> and
          certainty factor <cf> > 0.2
AND:  the angle between the models
          with labels <label1> and <label2> = 0
THEN: new certainty that label for <segment1> is correct is equal to
          Col (<segment1>, <segment2>) * <cf>

Figure  2.3.13  Rule  used  to  update  belief  of  segments  believed  collinear. 

If  the  two  segments  are  believed  to  correspond  to  different  model  edges,  a  segment’s  confidence 
is  updated  based  on  how  closely  the  angle  between  the  two  segments  matches  the  corresponding 
model  angle.  The  new  belief  is  defined  as  the  cosine  of  the  difference  between  the  expected  and 
measured  angles  multiplied  by  the  confidence  value  of  the  updating  edge.  Figure  2.3.14  shows 
an  example  of  a  rule  that  is  used  to  update  a  segment’s  belief  if  the  two  segments  correspond  to 
different  model  edges. 

RULE: update-belief-incompatible

IF:   current context is update-certainty of <segment1>
AND:  label of <segment1> is <label1> and
          length <length>
AND:  there is a segment named <segment2>
          with label <label2> that
          makes an angle of <true-angle> with <segment1>
          and has a certainty factor <cf> > 0.2
AND:  the angle between the models
          with labels <label1> and <label2> is <model-angle>
THEN: new certainty that label for <segment1> is correct is equal to
          cos (<true-angle> - <model-angle>) * <cf> * <scale-factor>


Figure  2.3.14  Rule  used  to  update  belief  of  segments  believed  noncollinear. 


Therefore, if the angle expected between the edges is exactly the angle found, the new belief
that the labeling is correct is equal to the belief that the updating edge's label is correct. Likewise,
the new disbelief is defined as the sine of the difference of the angles scaled by the
confidence of the updating edge's label. This is demonstrated by the rule in Figure 2.3.15.


RULE: update-disbelief-incompatible

IF:   current context is update-certainty of <segment1>
AND:  label of <segment1> is <label1> and
          length <length>
AND:  there is a segment named <segment2>
          with label <label2> that
          makes an angle of <true-angle> with <segment1>
          and has a certainty factor <cf> > 0.2
AND:  the angle between the models
          with labels <label1> and <label2> is <model-angle>
THEN: new certainty that label for <segment1> is incorrect is equal to
          sin (<true-angle> - <model-angle>) * <cf> * <scale-factor>

Figure  2.3.15  Rule  used  to  update  disbelief. 


For  example,  if  the  expected  angle  between  segments  was  60°  but  the  measured  difference  was 
30°  the  new  disbelief  would  be  one  half  the  belief  that  the  updating  segment’s  label  was  correct. 
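
The three update rules can be illustrated with a short Python sketch. It is not the OPS5 code: the function names are hypothetical, the scale factor defaults to 1 because its value is not given in the text, and the absolute value of the sine is an assumption made here so that the disbelief does not depend on the sign of the angle difference. The Dempster-Shafer style combination of old and new evidence is not shown.

```python
import math

CF_THRESHOLD = 0.2  # an updating segment must have a certainty above this


def belief_compatible(col, cf):
    """Both segments carry labels of collinear model edges (Figure 2.3.13)."""
    return col * cf if cf > CF_THRESHOLD else None


def belief_incompatible(true_angle_deg, model_angle_deg, cf, scale_factor=1.0):
    """New belief when the labels refer to different model edges (Figure 2.3.14)."""
    if cf <= CF_THRESHOLD:
        return None  # rule does not fire
    diff = math.radians(true_angle_deg - model_angle_deg)
    return math.cos(diff) * cf * scale_factor


def disbelief_incompatible(true_angle_deg, model_angle_deg, cf, scale_factor=1.0):
    """New disbelief when the labels refer to different model edges (Figure 2.3.15)."""
    if cf <= CF_THRESHOLD:
        return None  # rule does not fire
    diff = math.radians(true_angle_deg - model_angle_deg)
    return abs(math.sin(diff)) * cf * scale_factor  # abs() is an assumption
```

With a model angle of 60°, a measured angle of 30° and an updating certainty of 0.8, the disbelief comes out as 0.8 * sin 30° = 0.4, one half of the updating segment's certainty, as in the example above.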

The third subsystem used in PSEIKI is not yet complete; it will duplicate the processes
used in the first two but will work on groups of edges instead of singletons. Work must also be
done to incorporate camera calibration information into the reasoning system; this will remove
distortions due to perspective imaging.

2.3.3.4. Experimental Data

The expert system was run on data from a number of digitized images, one of which is
presented here. The image shown was taken slightly to the left of the center of a straight section
of sidewalk. This tests the expert system in the case of a positional error. Figure 2.3.11.a
shows the unprocessed image. Figure 2.3.11.b shows the segmented image; as can be seen,
it contains a large number of broken, noisy line segments. The output of
the first section of the expert system is shown in Figure 2.3.11.c. The number of segments was
reduced from 99 to 25 in this image. Figure 2.3.16 shows the segments labeled as left
and right edges of the sidewalk for all labels with positive certainty. As can be seen from the
results, the system performed admirably on this image. It found both edges of the sidewalk and
included only one segment that was not a sidewalk edge.

2.3.3.5. Conclusion

The present expert system demonstrates the validity of using a production system to implement
a geometric reasoning system. Such a system could be applied to FLIR images to identify
major components of the image such as the sky, mountains, forests, etc. Each of the three
components of the ATR process (detection, segmentation, and classification) could produce more
accurate results once these regions are found and correctly labeled. For example, the detector
would know not to look in the sky for tanks. The segmenter could use special segmentation
algorithms tuned to the type of region it is segmenting. Finally, the classifier could use the
region type information to tune its classification process.


3.  LASER  RADAR  RANGE  DATA  PROCESSING 


We believe that in the multi-sensor ATR systems of the future, LADAR will play an
important role in the verification of targets detected by FLIR; LADAR will also play an important
role in estimating the aspect of a target, an important piece of information about the hostile
intent of the target. In this section, we will discuss the extensive amount of work we have done
on segmentation of targets from LADAR range images; we will also report on the exploratory
work carried out by us on different reasoning strategies that could be employed for identifying
targets and estimating their aspects. It appears to us that for targets positioned within a kilometer
of a sensor with current resolution capability, it should be possible to carry out
identification and aspect estimation by reasoning in terms of target surfaces and their
interrelationships. However, for targets farther than a kilometer, sensors of today do not permit a
reliable extraction of surfaces, and recognition and aspect estimation must depend upon just the
silhouette.

3.1.  DATA  DESCRIPTIONS 

LADAR  data  from  two  field  tests  were  used  in  addition  to  synthetic  data.  The  following 
sections  describe  the  data  that  was  collected. 

3.1.1.  Description  of  the  1986  A.P.  Hill  Laser  Radar  Data 

The laser radar range images described as the 1986 AP Hill Laser Radar Data were taken
by the LADAR sensor described in [Rayt] at Fort A. P. Hill, Virginia on the 24th through the
28th of March 1986. Six images were selected from all the data taken at the A. P. Hill field test.
These images, shown in Figures 3.1.1-3.1.6, were selected to give a representative sampling of
targets, ranges, clutter, fields of view, and occlusion. Note that these figures show both a FLIR
image (top) and a LADAR image (bottom); only the range images were used for these
experiments. Each LADAR image is 160 by 96 pixels, with 8 bits per pixel. The LADAR sensor
can collect range images at different fields of view (FOV); Table 3.1.1 shows the instantaneous
FOV and the FOV of each image for the three FOVs used in Figures 3.1.1-3.1.6.

Table 3.1.1 Fields of view for the LADAR images in Figures 3.1.1-3.1.6.

FOV#    Instantaneous FOV                 FOV (mrad)
        Azimuth x Elevation (mrad)

1       0.1 x 0.1                         16 x 9.6
2       0.1 x 0.2                         16 x 19.2
5       0.2 x 0.2                         32 x 19.2
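
The FOV column of Table 3.1.1 is simply the instantaneous FOV multiplied by the 160 by 96 image size, and can be checked with a short Python sketch. The function names are hypothetical, and the ground footprint uses the small-angle approximation (1 mrad subtends about 1 m at 1 km), which is an assumption added for illustration.

```python
IMAGE_SIZE = (160, 96)  # pixels, azimuth by elevation

# Instantaneous FOV per pixel in mrad (azimuth, elevation), from Table 3.1.1.
IFOV_MRAD = {1: (0.1, 0.1), 2: (0.1, 0.2), 5: (0.2, 0.2)}


def total_fov_mrad(fov_number):
    """Full image FOV in mrad: per-pixel IFOV times the number of pixels."""
    az, el = IFOV_MRAD[fov_number]
    return (round(az * IMAGE_SIZE[0], 1), round(el * IMAGE_SIZE[1], 1))


def pixel_footprint_m(ifov_mrad, range_m):
    """Approximate size of one pixel on a target at the given range."""
    return ifov_mrad * 1e-3 * range_m  # small-angle approximation
```

For FOV 1 this gives 16 x 9.6 mrad, matching the table, and a 0.2 mrad pixel covers roughly 0.2 m on a target at 1 km.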

(FLIR image header)
Size = 320 x 192
TARGET = UH1 HELICOPTER
ASPECT = SIDE VIEW
RANGE = 1KM
FRAME 09951   RUN NO. 3
DATE 03/24/86   TIME 1335.46.069
TEST SITE FT A.P. HILL, VA
THE TEST SITE IS AT THE DROP ZONE AT A.P. HILL

(LADAR image header)
File: rvl1:~nvl/range-data/324/laser/images/im03
Size = 160 x 96
TARGET - UH-1
ASPECT - BSR
FIELD-OF-VIEW - FOV5

Figure 3.1.1 FLIR and Laser Radar Range image for target ap1.32403.


(FLIR image header)
File = rvl1:~nvl/range-data/324/flir/images/im11
Size = 320 x 192
TARGET = M60A2
ASPECT = RIGHT VIEW
RANGE = 2050M
FRAME 86247   RUN NO. 8
DATE 03/24/86   TIME 1616.01.692
TEST SITE FT A.P. HILL, VA
THE TEST SITE IS AT THE DROP ZONE AT A.P. HILL

(LADAR image header)
File: rvl1:~nvl/range-data/324/laser/images/im11
Size = 160 x 96
TARGET - M60A2
ASPECT - BSR
FIELD-OF-VIEW - FOV2

Figure 3.1.2 FLIR and Laser Radar Range image for target ap1.32411.


(FLIR image header)
File = rvl1:~nvl/range-data/325/flir/images/im04
Size = 320 x 192
TARGET = M60A2
ASPECT = RIGHT VIEW
RANGE = 1935M
FRAME 27691   RUN NO. 2
DATE 03/25/86   TIME 0909.34.192
TEST SITE FT A.P. HILL, VA
THE TEST SITE IS AT THE DROP ZONE AT A.P. HILL

(LADAR image header)
File: rvl1:~nvl/range-data/325/laser/images/im04
Size = 160 x 96
TARGET - M60A2
ASPECT - BSR
FIELD-OF-VIEW - FOV1

Figure 3.1.3 FLIR and Laser Radar Range image for target ap1.32504.


(FLIR image header)
File = rvl1:~nvl/range-data/326/flir/images/im33
Size = 320 x 192
TARGET = M60A2
ASPECT = RIGHT VIEW
RANGE = 1190M
FRAME 30200   RUN NO. IX
DATE 03/26/86   TIME 0400.01.100
TEST SITE FT A.P. HILL, VA
THE TEST SITE IS AT THE DROP ZONE AT A.P. HILL

(LADAR image header)
File: rvl1:~nvl/range-data/326/laser/images/im33
Size = 160 x 96
TARGET - M60A2
ASPECT - BSR
FIELD-OF-VIEW - FOV1

Figure 3.1.4 FLIR and Laser Radar Range image for target ap1.32633.


(FLIR image header)
Size = 320 x 192
TARGET = CENTER OF CLUSTER
ASPECT = RIGHT VIEWS
RANGE = 1540M
FRAME 30322   RUN NO. 7
DATE 03/28/86   TIME 0030.00.102
TEST SITE FT A.P. HILL, VA
THE TEST SITE IS AT THE DROP ZONE AT A.P. HILL

(LADAR image header)
File: rvl1:~nvl/range-data/328/laser/images/im37
Size = 160 x 96
TARGET - 2-M60A2/2.5TT
ASPECT - BSR
FIELD-OF-VIEW - FOV1

Figure 3.1.5 FLIR and Laser Radar Range image for target ap1.32837.


(FLIR image header)
Size = 320 x 192
TARGET = A CLUSTER OF OBJECTS
ASPECT = RIGHT VIEWS
RANGE = 1540M
FRAME ?>353   RUN NO. 10
DATE 03/28/86   TIME 216:.40.270
TEST SITE FT A.P. HILL, VA
THE TEST SITE IS AT THE DROP ZONE AT A.P. HILL

(LADAR image header)
File: rvl1:~nvl/range-data/328/laser/images/im39
Size = 160 x 96
TARGET - 2-M60A2/2.5TT
ASPECT - BSR

Figure 3.1.6 FLIR and Laser Radar Range image for target ap1.32839.


3.1.2.  Description  of  the  1987  A.  P.  Hill  Laser  Radar  Data 

The laser radar range images described as the 1987 AP Hill Laser Radar Data were collected
at Fort A. P. Hill, Virginia in June of 1987. For the experiments, four images (shown in
Figures 3.1.7-3.1.10) were selected to give a variety of ranges, targets and weather conditions.
The following sections give information about which images were used and how they were
modeled using the ground truth information. Detailed information about the field test can be
found in [NeSm87].

3.1.2.1. The Types of Data Collected

During  the  June  1987  field  test  five  types  of  data  were  collected: 

FLIR, 

LADAR  AM, 

LADAR  FM, 

LADAR  return  intensity,  and 
millimeter  wave  (MMW)  radar. 

The FLIR, MMW and LADAR images were collected from three different sensors, each in a
different van, and are therefore not pixel registered. The three LADAR signals were collected by
the same sensor and are all pixel registered. Although this report centers on the processing of
the LADAR data, future work will include fusing other sensors such as FLIR and MMW radar
with LADAR images. The FLIR images in Figures 3.1.7-3.1.10 are shown only to give additional
information about the targets in the field test and were not used in any of the processing
in this section. The following paragraphs discuss the three types of LADAR data.

The LADAR AM data is like the data collected during the 1986 A. P. Hill field test discussed
in Section 3.1.1. This data gives very fine relative range resolution (as good as 7.2 cm
between adjacent range values), but is based on the phase angle of the return signal so that there
are range ambiguities caused by phase wrap around. The AM data was collected using 8 bits per
pixel. The ambiguity interval is 1875 cm.

The LADAR FM image gives absolute range at a much coarser resolution than the AM
data. It was collected using 12 bits per pixel. However, neither the AM nor the FM images
were distributed. Instead, the LADAR AM and the LADAR FM images were combined to
form a LADAR resolved range image which, in theory,* should give absolute range with very
fine resolution. The LADAR resolved range image (referred to as LADAR in the rest of this
section) was distributed using 32 bits per pixel where each pixel value represents the absolute
distance in centimeters to the object.

* In practice, the noise component of the FM signal has been measured to be as much as 9
meters on a target at 1400 meters and therefore offsets many of the advantages of the AM
component's fine resolution.


Figure 3.1.7 Image c14ta3 a) Original 32 bit per pixel image linearly rescaled to 8 bits per
pixel, b) Range slice with black being everything in front of 1400m and white
being everything behind 1471.4 meters, c) FLIR image of same target, d) original
image mod 1875, e) Model of ground truth as seen by the sensor, f) Model of
ground truth rotated by 45° to see the orientation of the target.


Figure 3.1.8 Image c10m1 a) Original 32 bit per pixel image linearly rescaled to 8 bits per
pixel, b) Range slice with black being everything in front of 1015m and white
being everything behind 1086.4 meters, c) FLIR image of same targets, d) original
image mod 1875, e) Model of ground truth as seen by the sensor, f) Model of
ground truth rotated by 45° to see the relative spacing and orientation of the targets.


Figure 3.1.9 Image c17m3 a) Original 32 bit per pixel image linearly rescaled to 8 bits per
pixel, b) Range slice with black being everything in front of 1650m and white
being everything behind 1721.4 meters, c) FLIR image of same targets, d) original
image mod 1875, e) Model of ground truth as seen by the sensor, f) Model of
ground truth rotated by 45° to see the relative spacing and orientation of the targets.


Figure 3.1.10 Image c17m1 a) Original 32 bit per pixel image linearly rescaled to 8 bits per
pixel, b) Range slice with black being everything in front of 1625m and white
being everything behind 1696.4 meters, c) FLIR image of same targets, d) original
image mod 1875, e) Model of ground truth as seen by the sensor, f) Model of
ground truth rotated by 45° to see the relative spacing and orientation of the targets.

In the third type of LADAR data, the LADAR return intensity, each pixel is linearly proportional
to the strength of the return LADAR signal. The larger the pixel value, the stronger
the return signal. Such information might be useful as a confidence value associated with each
returned range pixel value. If the return signal is strong, a high confidence value can be associated
with the corresponding resolved range value. Unfortunately, the intensity data that was collected
is not usable since there were some problems during the field test which resulted in not
all of the data being linear [Phil87]. This data was therefore not used.

The LADAR sensor is capable of collecting images using different resolutions and different
frame sizes. All the experiments presented in this section used data with a resolution of
0.05 mrad in both directions and a frame size of 511 by 256 (horizontal by vertical), called
type 7C in [NeSm87].

3.1.2.2. The Test Images

The images in Figures 3.1.7-3.1.10 all have the same arrangement with (a) being the 32 bit
LADAR image and (c) being the 8 bit FLIR image of the same scene. The following paragraphs
present the general approach used to produce the other images. Table 3.1.2 is a summary
of the LADAR and FLIR image header information. Table 3.1.3 gives specific comments
on each of the test images.

The upper left image (a) in Figures 3.1.7-3.1.10 is the 32 bit per pixel LADAR range
image. Because of the large dynamic range, it was linearly rescaled to display as 8 bits per
pixel. Rescaling tends to cause the targets to blend into the background, so they are often not
visible in the image. The upper right image (b) is a LADAR range slice of the same image. The
range slice was found by applying the following formula to each of the pixels:

pixel_slice = (pixel - offset) / scale
if pixel_slice < 0
    then pixel_slice = 0
else if pixel_slice > 255
    then pixel_slice = 255

The resulting image has all pixels in front of offset set to black and all those behind the slice
(those greater than offset + scale * 255) set to white. Figures 3.1.7-3.1.10 list the front and the
back of the range slice in meters. A scale of at least 7 should be used since the range data is
given in centimeters, but the sensor can only resolve to 7.2 centimeters. Since the lower 2 bits
of the AM relative range sensor might be noisy, a scale of 28 (four times the minimum) was used
when displaying images. Note: all processing was done on the 32 bit data. The rescaling was only
used for display purposes.
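
The range slice formula can be sketched in Python as follows. The function names are hypothetical, and the use of integer (floor) division is an assumption, since the text does not say how the quotient is rounded.

```python
def range_slice(pixel_cm, offset_cm, scale=28):
    """Map a resolved range pixel (in cm) to an 8 bit display value.

    Pixels in front of the slice clamp to 0 (black); pixels behind
    offset + scale * 255 clamp to 255 (white).
    """
    value = (pixel_cm - offset_cm) // scale  # floor division is an assumption
    return max(0, min(255, value))


def slice_limits_m(offset_cm, scale=28):
    """Front and back of the range slice in meters."""
    return offset_cm / 100.0, (offset_cm + scale * 255) / 100.0
```

With an offset of 140000 cm and a scale of 28, the slice runs from 1400 m to 1471.4 m, which agrees with the limits quoted in the caption of Figure 3.1.7.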

The  left  image  on  the  second  row  (c)  is  a  FLIR  image  taken  of  the  same  configuration  of 
targets  using  a  narrow  field  of  view.  The  right  image  on  the  second  row  (d)  is  computed  from 
the  LADAR  image  on  a  pixel  by  pixel  basis  using  the  following  formula: 


Table 3.1.2 Summary of LADAR and FLIR header files for Figures 3.1.7-3.1.10.

                     c14ta3           c10m1            c17m3            c17m1

FILENAME             LD60977C14TA3    LD61577C10M1     LD61677C17M3     LD61777C17M1
TARGET TYPE          M60A1 TANK       M551 TANK,       M113 APC,        M113 APC,
                                      CJ5 JEEP,        2.5 TON TRUCK,   2.5 TON TRUCK
                                      M113 APC         CJ5 JEEP
SITE                 1400 M           1020 M           1700 M           1700 M
RANGE                1427 M           n/a              1693             1693
DATE                 9-JUN-1987       15-JUN-1987      16-JUN-1987      17-JUN-1987
WEATHER              HAZY /           HOT/HAZY/        HAZY             CLEAR
                     OVERCAST         HUMID

ladar
TIME                 15:20:44.53      14:43:57.42      14:56:10.89      09:59:03.50
FOV                  0.8 X 1.6 DEG    0.8 X 1.6 DEG    0.8 X 1.6 DEG    0.8 X 1.6 DEG
(unlabeled)          7C               7C               7C               7C
DATA SHIFT           -18              -18              -18              -18
MIRROR REFLECTION    NO               NO               NO               NO
MODE                 IMAGE            IMAGE            IMAGE            IMAGE
RESOLVED RANGE       CENTIMETERS      CENTIMETERS      CENTIMETERS      CENTIMETERS

flir
TIME                 1520.39.424      1454.43.928      1456.00.272      0958.53.059
FOV                  N                N                N                N
BRIGHTNESS           240              200              190              190
CONTRAST             120              40               70               80

Table 3.1.3 Comments about the test images selected from the 1987 A.P. Hill field test.

Figure       Comment

3.1.7 a,b    This is a single M60A1 at 1400 meters. As with most
             LADAR, the sky is easily found because it appears as
             random return values on the upper part of the image.

3.1.7 d      The target is on a wrap around line since the dark right
             side must be farther than the white left side. This is one
             of the problems with the relative range AM data.
             However, this does show that large geometric features
             (such as the various planar surfaces of a target) will
             appear in LADAR data even at 1400 meters.

3.1.7 e,f    This field test involved taking images with the turret
             at different rotations. The ground truth data is unclear as
             to the location of the reference to which the rotation
             angles were measured. Since our model currently does
             not rotate the turret apart from the rest of the tank,
             the ground truth image might be inaccurate.

3.1.8 a,b    There are three targets (M60A1, Jeep, M113) at about
             1000 meters.

3.1.8 d      The mod 1875 image makes the targets more visible
             than the range slice image.

3.1.8 e,f    Although the locations of the targets match the LADAR
             Log ground truth, the relative positions of the three
             targets appear wrong in the ground truth image. The FLIR
             image (c) clearly shows that all three targets are
             unoccluded. It is possible the angle to the sensor was
             mismeasured.

3.1.9 a-d    There are three targets (M113, Jeep, 2.5 ton truck) taken
             at 1700 meters on a humid hazy day. The targets are
             almost undetectable except in the FLIR image (c), in
             which they are clearly visible.

3.1.10 a-d   There are two targets (M113, 2.5 ton truck) at 1700
             meters. The weather was clear and the targets are more
             visible than the targets in Figure 3.1.9.

pixel_am = pixel mod 1875

Since the resolved range image is in centimeters and the AM ambiguity interval is 1875 cm, the
AM portion of the signal can be found by taking the mod 1875 of each pixel. That is why these
images look similar to the images in Section 3.1.1. Close inspection of the mod 1875 images
reveals that there is less noise on target than shown in the range slice image. We believe this is
caused by the noise present in both the AM and FM components of the return signal. The FM
signal is used to decide to which absolute range bin the relative AM signal corresponds.
The resolved range image, which is a combination of both the AM and FM signals, therefore
has the noise of both signals. When the mod 1875 image is taken, the effects of the FM signal
(and its noise) are removed, which results in a cleaner image. The FM noise has been measured
to be as much as 9 meters at a range of 1400 meters [Phil87].
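
A minimal Python sketch of the mod 1875 operation is given below. The function names are hypothetical, and the interpretation that an FM error shows up as a wrong range bin (a whole multiple of the ambiguity interval) is an assumption used only to illustrate why the mod image is unaffected by it.

```python
AMBIGUITY_CM = 1875  # AM ambiguity interval in centimeters


def am_component(resolved_cm):
    """Relative (AM) range: position inside the ambiguity interval."""
    return resolved_cm % AMBIGUITY_CM


def range_bin(resolved_cm):
    """Index of the ambiguity interval (the part selected by the FM signal)."""
    return resolved_cm // AMBIGUITY_CM
```

A pixel at 140000 cm falls in bin 74 with an AM component of 1250 cm. Moving the pixel to the next bin (141875 cm) changes the resolved range by 18.75 m but leaves the AM component at 1250 cm. The same operation also shows the wrap around: ranges of 1874 cm and 1876 cm, only 2 cm apart, map to AM values of 1874 and 1.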

The images on the bottom row (e and f) were constructed using the ground truth data from
the Laser Radar Log [NeSm87]. This data gives the type and position of each target relative to
a reference marker as shown in Figure 3.1.11. The data for image c10m1 is:

SV = 132 degrees (position of sensor)

Right target
    APC
    RR = 52 feet (52 feet to right rear)
    CT = 5 degrees
    ST = 55 degrees

Center target
    Jeep
    LF = 58 feet (58 feet to left front)
    CT = 335 degrees
    ST = 55 degrees

Left target
    Tank
    RF = 1 foot (1 foot to right front)
    CT = 355 degrees
    ST = 150 degrees


* The LADAR LOG reports the Jeep CT to be 355° and the Tank CT to be 335°; however,
these figures were corrected as shown during a phone conversation with Bob Dockery from the
Center for Night Vision.


Figure 3.1.11 Relative placement of targets as given in the LASER RADAR LOG [NeSm87].

The following function, written in LISP, converts this data into PADL statements which position
the targets.

(defun group (van stream &rest targets)
  ; van     - angle from reference to LADAR van
  ; stream  - place to write output data
  ; targets - list of targets in scene
  (let ((count 0))                        ; number of targets
    (mapc #'(lambda (target)              ; convert each target in the list of targets
              (format stream              ; write the following to stream.
                      ;; The following is code written for PADL.
                      "target~s = ~a_~a at (degz=~a, movx=~a, degz=~a)~%"
                      count
                      (type target)
                      (corner target)
                      ;; The rotation about the center must be
                      ;; corrected because the ct below will turn the
                      ;; target too.
                      (- (ct target) (st target))   ; Rotation about center of target
                      (* (offset target) 0.3048)    ; Convert feet to meters
                      (- (ct target)))
              (setq count (1+ count)))
          targets)
    (format stream "all = (target0")
    (dotimes (i (1- count))
      (format stream " un target~s" (1+ i)))
    (format stream ") at degz=~a~%" van)))

(defmacro type (x)      ; Type of target
  `(car ,x))

(defmacro corner (x)    ; Corner to measure to. (one of lf, rf, lr, rr)
  `(nth 1 ,x))

(defmacro offset (x)    ; Distance from corner to reference marker.
  `(nth 2 ,x))

(defmacro ct (x)        ; Angle from reference to corner of target
  `(nth 3 ,x))

(defmacro st (x)        ; Rotation about the target's axis.
  `(nth 4 ,x))

This input:

;;; This is a padl description of the targets for TAPE DF1572
;;; Time: 1430

(group 132 t
       '(m113 rr 52 5 55)
       '(m151 lf 58 335 145)
       '(m60a1 rf 1 355 150))

produces the following PADL code for image c10m1.

target0 = m113_rr at (degz=-50, movx=15.8496, degz=-5)
target1 = m151_lf at (degz=190, movx=17.6784, degz=-335)
target2 = m60a1_rf at (degz=205, movx=0.3048, degz=-355)
all = (target0 un target1 un target2) at degz=132

The first line of input to the group routine states that the van with the LADAR sensor is 132
degrees from the reference. The second line of input to the group routine places the right rear
corner of the M113 APC at 52 feet and 5 degrees from the reference. The APC is rotated about
its axis 55 degrees. This translates into PADL code by arbitrarily identifying the APC as
target0. target0 places m113_rr (the right rear of an M113) by rotating it -50 degrees about its
axis. (It is a negative amount since PADL defines rotations opposite from that used in the
LADAR Log. It is only 50 degrees, not 55, since the second rotation will take it the needed extra
5 degrees.) Next the APC is moved 15.8496 meters (52 feet) from the reference and then
rotated 5 degrees about the reference. The same is done for the M151 and the M60A1.
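As an illustration only, the translation performed by the group routine can be sketched in Python. The function name group_to_padl and the tuple layout (type, corner, offset in feet, ct, st) are assumptions chosen to mirror the Lisp accessors above; the rounding to four decimals is also an assumption, made so that the printed distances match the listing.

```python
FEET_TO_METERS = 0.3048


def group_to_padl(van, targets):
    """Translate LADAR Log target entries into PADL-style placement lines.

    Each target is (type, corner, offset_ft, ct_deg, st_deg), mirroring the
    list layout read by the Lisp accessors type, corner, offset, ct and st.
    """
    lines = []
    for count, (kind, corner, offset_ft, ct, st) in enumerate(targets):
        rotation_about_axis = ct - st  # corrected for the later turn by ct
        distance_m = round(offset_ft * FEET_TO_METERS, 4)
        lines.append(
            f"target{count} = {kind}_{corner} at "
            f"(degz={rotation_about_axis}, movx={distance_m}, degz={-ct})"
        )
    union = " un ".join(f"target{i}" for i in range(len(targets)))
    lines.append(f"all = ({union}) at degz={van}")
    return lines


padl = group_to_padl(132, [("m113", "rr", 52, 5, 55),
                           ("m151", "lf", 58, 335, 145),
                           ("m60a1", "rf", 1, 355, 150)])
print("\n".join(padl))
```

For the three targets of image c10m1 this prints the same four lines as the PADL code shown above.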

This data was used by the Electronic Terrain Board Model discussed in Section 4 to produce
the ground truth images. The left image is of the scene as viewed by the sensor, and the
right image is the same view rotated 45° to give a better view of the relative spacing and
orientation of the targets. Figure 3.1.12 shows the targets in image c10m1 as viewed from above.
The markings show that the targets are placed as described in the LADAR Log; however, when
viewed from the sensor position, the targets do not appear to have the same relative positions.
We are unable to explain the differences at this time. One possibility is that our selection of
wire frame models is limited, so the models used in the ground truth images most likely do not
match exactly the actual targets used in the field test. If, for example, our M60A1 model were
wider than the actual target, it would occlude the Jeep as shown.

3.1.2.3. Preprocessing the Data

Close inspection of the targets in the range slices of the resolved range data shows that the
targets, though distinctive in three of the four test images*, all contain considerable impulse
noise. Images (c) and (e) in Figures 3.1.13-3.1.16 show that using a 3 by 3 and a 5 by 5 median
filter on the images removes much of the noise. To better see the effects of the filtering
operation, the tank and the APC from image c10m1 and the APC and the 2.5 ton truck from image
c17m1 are enlarged in Figures 3.1.17-3.1.20. Some observations are in order. Note the jagged left
edge of the tank in Figure 3.1.17(b) and the jagged right edge of the APC in Figure 3.1.18(b).
The jaggedness is caused by the sensor scanning alternate lines in different directions (i.e., if
the even lines are scanned from left to right, then the odd lines will be scanned from right to
left). Unfortunately, near the edges of the image the alternate lines do not line up exactly, as
shown in these images. This lack of alignment is obscured in the resolved range image, but
made clear in the mod 1875 image. Apparently this problem didn't occur in the c17m1 images.

* Apparently image c17m3 was taken on a very hazy day and the targets are not as visible in it.

Figure 3.1.12 Targets in image c10m1 as viewed from above.

Median filtering removes much of the impulse noise, as shown in Figures 3.1.21-3.1.24;
unfortunately, it may also be removing much of the fine structure information on the target. The
great dynamic range of 32 bit data presents some problems not present in 8 bit data. In this data
a noise spike can cause a pixel value to appear a kilometer or two away from the target. Such
noise must be removed, but at the same time the fine “10’s of centimeters” resolution must be
preserved since it might contain structural information about the target.

The mod 1875 images provide the first clue as to how to process the 32 bit data. The targets
are often more visible in these images than in the original resolved range images. The
impulse noise on target appears to be much lower, which means the FM data must have contained
noise which caused the pixel value to be put in the wrong range bin. Our median based
range bin corrector (MBRBC) attempts to fix this problem by looking at the median value of
all the pixels in a window around a given pixel. If the given pixel is more than one ambiguity
interval (1875 cm) away from the median, an integer number of ambiguity intervals will be added
to (or subtracted from) the pixel value until it is within an ambiguity interval of the median.
If the image has no range bin errors, the pixel value will be within an ambiguity interval of the
median and will therefore not be changed. If the pixel is misplaced by a range bin error, it will
be placed back into the correct bin (assuming the median of the neighbors is within the correct
bin). Images (d) and (f) in Figures 3.1.13-3.1.16 show the test images after using the MBRBC
filter. These images show that there is no great improvement in the overall appearance; however,
Figures 3.1.25-3.1.28 show close-ups of the four targets used before, processed with the MBRBC
filter. Here it is clear that the standard median filter tends to remove what might be structural
details and smooth out the image. The MBRBC images have the same noise spikes removed,
but still have the information that might allow fine structural details to be extracted.
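The difference between the two filters can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used for the figures: the window is taken to include the centre pixel and to be clipped at the image border, and the names window_values, median_filter and mbrbc are hypothetical.

```python
from statistics import median

AMBIGUITY_CM = 1875  # ambiguity interval quoted in the text


def window_values(image, row, col, half):
    """Pixel values in a (2*half+1)-square window, clipped at the image border."""
    rows, cols = len(image), len(image[0])
    return [image[r][c]
            for r in range(max(0, row - half), min(rows, row + half + 1))
            for c in range(max(0, col - half), min(cols, col + half + 1))]


def median_filter(image, half=1):
    """Standard median filter: every pixel is replaced by its window median."""
    return [[median(window_values(image, r, c, half))
             for c in range(len(image[0]))]
            for r in range(len(image))]


def mbrbc(image, half=1, interval=AMBIGUITY_CM):
    """Move each pixel by whole ambiguity intervals until it lies within one
    interval of its window median; pixels already that close are untouched."""
    corrected = [list(row) for row in image]
    for r in range(len(image)):
        for c in range(len(image[0])):
            med = median(window_values(image, r, c, half))
            value = image[r][c]
            while value - med > interval:
                value -= interval
            while med - value > interval:
                value += interval
            corrected[r][c] = value
    return corrected


# Toy range image (cm): the centre pixel's true range is 50030 cm, but it
# has been pushed two range bins (2 * 1875 cm) too far.
image = [[50000, 50010, 50020],
         [50000, 50030 + 2 * AMBIGUITY_CM, 50020],
         [50000, 50010, 50020]]

print(mbrbc(image)[1][1])          # 50030: spike removed, fine offset kept
print(median_filter(image)[1][1])  # 50010: spike removed, offset smoothed away
```

In this toy case both filters remove the spike, but only the range bin corrector leaves the centimeter-level value of the pixel intact, which is the behavior described above.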

3.1.2.4. Comments on the Quality of the Data

The LADAR sensor requires careful tuning before it will produce good images. The quality
of the data taken during the 1987 field test appears to vary greatly from one set of images to
another. An example is images c17m1 and c17m3, which were taken at the same distance and
field of view, but the targets in c17m1 are much more visible than those in c17m3. This could
be caused by the weather being clear the day c17m1 was taken and hazy the day c17m3 was
taken, or it could be caused by improper tuning. Such variation in results, whether caused by
tuning or the weather, makes it difficult to estimate the maximum distance at which such a
sensor can be used.

Figure 3.1.13 Image c14m3 a) Range slice as in Figure 3.1.7, b) mod 1875, c) Median using 3
by 3 window, d) Smart median using a 3 by 3 window, e) Median using a 5 by 5
window, f) Smart median using a 5 by 5 window.

Figure 3.1.14 Image c10m1 a) Range slice as in Figure 3.1.8, b) mod 1875, c) Median using 3
by 3 window, d) Smart median using a 3 by 3 window, e) Median using a 5 by 5
window, f) Smart median using a 5 by 5 window.

Figure 3.1.15 Image c17m3 a) Range slice as in Figure 3.1.9, b) mod 1875, c) Median using 3
by 3 window, d) Smart median using a 3 by 3 window, e) Median using a 5 by 5
window, f) Smart median using a 5 by 5 window.

Figure 3.1.16 Image c17m1 a) Range slice as in Figure 3.1.10, b) mod 1875, c) Median using 3
by 3 window, d) Smart median using a 3 by 3 window, e) Median using a 5 by 5
window, f) Smart median using a 5 by 5 window.

Figure 3.1.17 Tank from image c10m1. The upper left corner of this image is pixel 100, 110
from image c10m1. a) Rescaled image, b) mod 1875 image.

Figure 3.1.18 APC from image c10m1. The upper left corner of this image is pixel 300, 110
from image c10m1. a) Rescaled image, b) mod 1875 image.

Figure 3.1.19 APC from image c17m1. The upper left corner of this image is pixel 144, 115
from image c17m1. a) Rescaled image, b) mod 1875 image.

Figure 3.1.20 2.5 ton truck from image c17m1. The upper left corner of this image is pixel
251, 120 from image c17m1. a) Rescaled image, b) mod 1875 image.

Figure 3.1.21 Median filtered tank from image c10m1. The upper left corner of this image is
pixel 100, 110 from image c10m1. a) 3 by 3 window, b) 5 by 5 window.

Figure 3.1.22 Median filtered APC from image c10m1. The upper left corner of this image is
pixel 300, 110 from image c10m1. a) 3 by 3 window, b) 5 by 5 window.

Figure 3.1.23 Median filtered APC from image c17m1. The upper left corner of this image is
pixel 144, 115 from image c17m1. a) 3 by 3 window, b) 5 by 5 window.

Figure 3.1.24 Median filtered 2.5 ton truck from image c17m1. The upper left corner of this
image is pixel 251, 120 from image c17m1. a) 3 by 3 window, b) 5 by 5 window.

Figure 3.1.25 MBRBC filtered tank from image c10m1. The upper left corner of this image is
pixel 100, 110 from image c10m1. a) 3 by 3 window, b) 5 by 5 window.

Figure 3.1.26 MBRBC filtered APC from image c10m1. The upper left corner of this image is
pixel 300, 110 from image c10m1. a) 3 by 3 window, b) 5 by 5 window.

Figure 3.1.27 MBRBC filtered APC from image c17m1. The upper left corner of this image is
pixel 144, 115 from image c17m1. a) 3 by 3 window, b) 5 by 5 window.

Figure 3.1.28 MBRBC filtered 2.5 ton truck from image c17m1. The upper left corner of this
image is pixel 251, 120 from image c17m1. a) 3 by 3 window, b) 5 by 5 window.

Another  problem  with  the  data  is  the  misalignment  between  the  even  and  odd  scan  lines. 
A  simple  solution  is  to  only  use  every  other  scan  line,  but  such  an  approach  then  throws  out  half 
of  the  data. 

3.1.2.5. Future Work

Since this is the first report in which the 1987 A. P. Hill data was examined, there are
many areas that need more work. The following paragraphs give details on the following ideas:

1.  Create  ground  truth  images  using  the  Electronic  Terrain  Board  Model. 

2.  Devise  a  method  to  measure  the  effectiveness  of  the  processing  filters. 

The Electronic Terrain Board Model needs to be improved so that it can produce images,
based on the ground truth information given for each image, which are accurate enough to be
considered ground truth images. This model needs to include both the targets and the terrain.
We expect these changes to be brought about by switching to the BRL-CAD modeler, which
will allow the very detailed BRL models to be used.

We need to develop a method of measuring the effectiveness of the median and MBRBC
filters. Visually speaking, the median filter appears to remove some fine structure information;
however, we have no way of measuring how much information is lost. One good method would
be to

1) use the ground truth data to know the location and orientation of the planar regions,

2) filter the LADAR data with the preprocessor to be tested,

3) extract the geometric features as presented in Section 3.3.1,

4) compare the extracted features to those in the ground truth data.

Such a procedure should give a very accurate measure of a filter’s effectiveness, provided the
models used in the ground truth data are close enough to the targets which actually appeared in
the image.

3.2.  EVALUATION  OF  LADAR  IMAGES 

3.2.1. Preliminary Results of Measuring Classifiability vs. Range in LADAR

One of the unanswered questions about range data is: “At what range can targets be accurately
classified?” Of course the answer depends on many factors including the sensor, the
weather, and the target. In the work presented here an attempt has been made to find an upper
bound on the classification accuracy by measuring the classifiability of noiseless synthetic data.
The only noise present in such data is the spatial quantization noise caused by having fewer
pixels on target as the target is farther away. The following section presents the results of the
study.

3.2.1.1.  The  Synthetic  Images 

The synthetic images were generated as shown in Section 4.1. The four targets used in
these experiments are the M60A1, M113, BMP, and BRDM2. Each target was viewed from
zero elevation (slant angle of zero). The targets were rotated from zero to 360 degrees and an
image was taken every six degrees for a total of sixty images per class. Section 4.1.3 discusses
how to add noise to an image; however, for these experiments, no noise was added.

3.2.1.2.  The  Segmentation  and  Features 

Since  synthetic  data  is  used,  the  data  is  perfectly  segmented,  therefore  no  segmenter  is 
needed. The same two feature sets were used as in Section 3.3.4.2.

3.2.1.3.  Classification  Results 

Tables 3.2.1 and 3.2.2 show the results of the experiment for each feature set.

Table 3.2.1 Classification accuracy of noiseless synthetic LADAR data. 60 each of M60A1,
M113, BMP, and BRDM2 using shape features.


As expected (due to lack of sensor noise) the classification accuracies were very high. In
general the accuracies decreased as the distance to the target increased. Unexpected results appear
at 5 km using the shape feature set and at 4 km using the Beta feature set. We can offer no
explanation for this at this time.

3.2.1.4. Conclusion and Future Work

Since noiseless data is being used, conclusions about real data cannot be made (other than
the obvious “the classification accuracy will decrease with distance”).

Table 3.2.2 Classification accuracy of noiseless synthetic LADAR data. 60 each of M60A1,
M113, BMP, and BRDM2 using Beta features.

Beta features (Table 3.3.11)

Distance         Lower Bound   Upper Bound
[illegible]      0.0           0.0
[illegible]      0.83          0.83
1 [illegible]    2.08          2.50
[illegible]      1.67          4.17
4km              1.25          3.33
5km              3.33          5.00

Future  work  in  the  area  will  start  with  using  synthetic  data  with  noise  added.  Section  4.1.3 
gives  details  on  how  such  data  is  being  created. 

3.2.2. Further Experimentation to Determine Classifiability vs. Range

In continuing the study of classifiability vs. range, the classification experiment in Section
3.2.1 was expanded by generating synthetic targets and artificially degrading them with noise as
in Section 4.1.3. The same orientation angle is still used, but the samples now include targets
from four target classes at ranges of 0.5, 1.0, 2.0, 3.0, 4.0, and 5.0 km.

In addition, each synthetic LADAR image from a particular target class and range was
corrupted with nine different noise characteristics, bringing the total sample size to 216. The noise
characteristics were created by combining various levels of "drop out" and Gaussian noise.
Realistic levels of these noise types were determined using the table of results presented in
Section 4.1.3. From this data, overall minimums, maximums, and averages were determined for
each noise type. For convenience we define three noise levels (low, medium, and high).
These levels are shown in Table 3.2.3.
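How a drop-out probability and a Gaussian standard deviation might be combined to corrupt a range image is sketched below. The noise model of Section 4.1.3 is not reproduced in this section, so the details are assumptions: a dropped pixel is replaced by a fixed drop_value, every other pixel receives zero-mean Gaussian noise, and the function name add_ladar_noise is hypothetical. Only the low and high levels quoted in this report are included.

```python
import random

# (drop-out probability, Gaussian standard deviation) for two of the levels.
NOISE_LEVELS = {"low": (0.005, 7.00), "high": (0.102, 17.2)}


def add_ladar_noise(image, drop_out_prob, gaussian_sd, rng, drop_value=0.0):
    """Return a noisy copy of a range image given as a list of rows.

    With probability drop_out_prob a pixel is replaced by drop_value
    (an assumed drop-out model); otherwise zero-mean Gaussian noise with
    standard deviation gaussian_sd is added to it.
    """
    noisy = []
    for row in image:
        new_row = []
        for value in row:
            if rng.random() < drop_out_prob:
                new_row.append(drop_value)
            else:
                new_row.append(value + rng.gauss(0.0, gaussian_sd))
        noisy.append(new_row)
    return noisy


rng = random.Random(0)  # fixed seed so that a run can be repeated
clean = [[100.0, 200.0], [300.0, 400.0]]
pdo, _ = NOISE_LEVELS["high"]   # drop-out level of an "H/L" image
_, gsd = NOISE_LEVELS["low"]    # Gaussian level of an "H/L" image
noisy = add_ladar_noise(clean, pdo, gsd, rng)
```

Pairing each drop-out level with each Gaussian level in this way gives the nine noise characteristics (L/L through H/H) used in the experiment.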

Table 3.2.3 Noise levels used in classifiability experiment.

                Noise Parameters
Level     Drop Out Probability   Gaussian Standard Deviation
low       0.005                  7.00
medium    0.043                  [illegible]
high      0.102                  17.2

3.2.2.1. Segmentation

The planar patch fitting algorithm of Section 3.3.3.1 was again used to segment the samples.
Unfortunately, the automatic threshold selection of this algorithm worked very poorly on
these images, and the segmentation became an iterative process of choosing segmentation
thresholds and observing results. The best segmentation of this step was then further processed with
the aid of connected component labeling to remove holes in the silhouette as well as false target
regions outside the silhouette. Figures 3.2.1 and 3.2.2 show some examples of synthetic
imagery and the corresponding final segmentations.

3.2.2.2.  Classification  Results 

Although  separate  experiments  using  both  the  shape  and  moment  feature  sets  were  run  as 
before,  problems  which  will  require  further  investigation  were  encountered  with  the  moment 
feature  set  experiment.  The  covariance  matrices  computed  for  these  experiments  were  nearly 
singular,  thus  making  the  classification  results  useless. 

Three  experiments  using  the  shape  feature  set  produced  valid  results,  however,  and  these 
appear  in  Tables  3.2.4,  3.2.5,  and  3.2.6.  Table  3.2.4  directly  addresses  the  question  of 
classifiability  vs.  range,  while  Table  3.2.5  examines  the  effects  of  noise  on  classifiability.  Table 
3.2.6 conveys the results of an overall classification experiment in which samples were not
separated  according  to  range  or  noise. 

Table 3.2.4 Classifiability vs. Range study.

Classifiability vs. Range
Shape feature set

Range (km)   Error (Lower bound)
0.5          0.0 % (0/36)
1.0          2.8 % (1/36)
2.0          0.0 % (0/36)
3.0          19.4 % (7/36)
4.0          2.8 % (1/36)
5.0          0.0 % (0/36)

3.2.2.3.  Conclusion 

Originally the experiment presented here was to be performed on a much larger scale with
60 orientations for each target class. At this point, however, the processing steps involved are
not sufficiently refined and automated to make such an experiment possible for this report†.
Because of this, the sample set here is still too small to be of great help. In each range set, for
example, there were only 36 samples. Each noise set only had 24 samples. This was the reason
for not estimating the upper bound error in these two experiments. This estimate requires one
sample to be left out, and with so few samples to begin with, the results would be questionable.

It is strange that no trend is evident in either of these experiments. This could be due to
the small sample sizes. Another interesting point is the error jump in the range experiment for
the 3 km case. The reason for this is still unknown, but the small sample size could have
exaggerated the problem.

† Currently such an experiment would require an estimated month of continuous computer
time alone, not including the user interaction needed to be sure all thresholds were set properly.

Figure 3.2.1 Segmentation examples. Synthetic image of M113 at 0.5 km. Panels: synthetic
image, noise L/L; segmented image, noise L/L; segmented image, noise H/H.

Figure 3.2.2 Segmentation examples. Synthetic image of M113 at 5.0 km. Panels: synthetic
image, noise L/L; segmented image, noise L/L; synthetic image, noise H/H; segmented image,
noise H/H.

Table 3.2.5 Classifiability vs. Noise study. Noise characteristics are specified as: Drop Out
Probability/Gaussian Standard Deviation. Letters indicate the noise level, i.e., Low, Medium,
or High.

Classifiability vs. Noise
Shape feature set

Noise   Error (Lower bound)
L/L     4.2 % (1/24)
L/M     4.2 % (1/24)
L/H     4.2 % (1/24)
M/L     4.2 % (1/24)
M/M     0.0 % (0/24)
M/H     8.3 % (2/24)
H/L     4.2 % (1/24)
H/M     4.2 % (1/24)
H/H     4.2 % (1/24)

Table 3.2.6 Overall classification experiment. Classification experiment using all samples, i.e.,
no separation by range or noise.

Overall Classification Experiment
Shape feature set

Lower bound     Upper bound
4.2% (9/216)    13.4% (29/216)

3.2.2.4. Future Work

The next step in this study of classifiability vs. range is the addition of orientation. Ideally
the experiment will be an expansion of the work presented in Section 3.2.1 where synthetic
images of the same four target classes were used. In that study, 60 different orientations of each
sample were included. To run the above experiment with this many orientations would require
processing 12,960 images. Obviously this experiment may need to be trimmed down to a more
reasonable level, but in any case, some aspects of the classification process (such as
segmentation) must be improved before such a large scale study can be attempted.
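The sample counts quoted in Section 3.2.2 follow directly from the experimental design, as the short calculation below shows. The variable names are ours; the numbers of classes, ranges, noise levels and orientations are those given in the text.

```python
target_classes = 4                           # M60A1, M113, BMP, BRDM2
ranges_km = [0.5, 1.0, 2.0, 3.0, 4.0, 5.0]
noise_levels = ["L", "M", "H"]
# Nine noise characteristics: drop-out level / Gaussian level.
noise_combinations = [f"{p}/{g}" for p in noise_levels for g in noise_levels]

total = target_classes * len(ranges_km) * len(noise_combinations)
per_range = total // len(ranges_km)           # samples in each range set
per_noise = total // len(noise_combinations)  # samples in each noise set
with_orientations = total * 60                # 60 orientations per sample

print(total, per_range, per_noise, with_orientations)  # 216 36 24 12960
```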

3.2.3.  Optimal  Sampling  of  the  Feature  Space 

With the availability of synthetic imagery, endless data can be generated for a given
experiment. Unfortunately experiments run on endless data complete after endless time. The
synthetic test data must be carefully chosen to represent the types of images which will be viewed
in real situations. This section presents two approaches to intelligently reducing the amount of
data used in a classification experiment. The first approach, presented in Sections 3.2.3.1 -
3.2.3.6, uses a new method based on the Reduced Parzen Classifier. The second approach is
classical clustering, which is surveyed in Section 3.2.3.7.

3.2.3.1. Motivation

This section pursues the design of a statistical classifier for use in the experiment of
[KaYo88] on the classifiability of LADAR silhouette data vs. range. The thrust of the experiment
is to measure the decreasing classifiability of noisy synthetic LADAR silhouette data with
increasing distance to the target. The initial approach was to train the classifier using 60 images
of each target. An image was generated for every six degrees of target rotation. It was noted that
this uniform sampling of rotation space was not the optimal approach to solving the problem. In
this section we show a method of selecting a subset of available data to use as design samples for
the classifier.

Originally it was hoped that this work would lead to a drastic reduction in the amount of
processing necessary to train the statistical classifier by identifying a set of "key orientations"
for the design samples that could be used for all ranges and noise levels. The results presented
here indicate that this reduction in processing will not be possible, but it does appear that it may
be possible to use fewer than 60 design samples in training the classifier. This would reduce the
computation necessary for classification. In any case, the results of this work are interesting and
have proven useful in understanding the estimation of class densities for purposes of
classification.

3.2.3.2. The Feature Space

This section is merely an effort to better understand the problem at hand by drawing
attention away from the target orientation question and focusing it on the feature space. It is the
result of much thought toward better understanding the problem of density estimation and,
although not actually implemented, should serve as an instructive prelude to the algorithm
presented in the next section.

Since  our  statistical  classification  experiments  are  based  on  target  class  density  estimation, 
it  seems  reasonable  to  pose  the  problem  as  follows: 

    Given that r* samples are sufficient to represent a class density $p(x)$, and a "good" density
    estimate $p_N(x)$ is available, what r sample subset of the N available samples gives the best
    density estimate $p_r(x)$?

Breaking the problem down a bit further: If j samples (j<r) have been chosen to estimate
the class density $p(x)$, which of the N-j remaining samples should be chosen next? Is there a
portion of the class density that is underrepresented? Of the N-j remaining samples, suppose
we choose the least likely sample given the current class density estimate, $p_j(x)$. In other
words, choose the sample, S, from the N-j remaining samples such that $p_j(S)$ is a minimum.
This should force the r representative samples to be spread well over the N available samples,
and could be accomplished through a simple application of Parzen density estimation
techniques.
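A minimal sketch of this selection rule is given below, assuming a spherical Gaussian Parzen kernel of width h and an arbitrarily chosen first sample; neither choice is specified in the text, and the names parzen_density and select_spread_subset are hypothetical.

```python
import math


def parzen_density(x, samples, h=1.0):
    """Gaussian-kernel Parzen estimate at point x from a list of sample points."""
    d = len(x)
    norm = (2.0 * math.pi) ** (d / 2.0) * h ** d
    total = 0.0
    for s in samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
        total += math.exp(-sq_dist / (2.0 * h * h)) / norm
    return total / len(samples)


def select_spread_subset(samples, r, h=1.0, first=0):
    """Greedily pick r sample indices: at each step take the remaining sample
    that is least likely under the estimate built from those already chosen."""
    chosen = [first]
    remaining = [i for i in range(len(samples)) if i != first]
    while len(chosen) < r and remaining:
        current = [samples[i] for i in chosen]
        nxt = min(remaining, key=lambda i: parzen_density(samples[i], current, h))
        chosen.append(nxt)
        remaining.remove(nxt)
    return chosen


features = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
print(select_spread_subset(features, 3))  # [0, 5, 3]
```

Starting from the first sample, the rule next picks the isolated point at (10, 0) and then one point of the cluster near (5, 5), so the three representatives are spread across the available samples as intended.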

Although this procedure should space the r representative samples well across the class
density’s extent, nothing has been done to indicate sample populations in the regions
surrounding the r representative samples. An obvious approach to this problem is to assign a weight to
each representative sample based on how many of the N-r remaining samples are closer to it
than any other representative sample. The commonly used distance metric, $(X-Y)^T \Sigma^{-1} (X-Y)$,
could be used here. The sample covariance, $\Sigma$, is calculated using the N available samples.
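The weighting step can be sketched as follows for a two-dimensional feature space. The helper names are hypothetical, the covariance is computed with the usual N-1 divisor (an assumption), and the 2 by 2 inverse is written out explicitly to keep the example free of external libraries.

```python
def covariance_2d(samples):
    """Sample covariance matrix of a list of 2-D points."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    return ((sxx, sxy), (sxy, syy))


def inverse_2x2(m):
    """Inverse of a 2 by 2 matrix given as nested tuples."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return ((d / det, -b / det), (-c / det, a / det))


def distance_sq(x, y, inv_cov):
    """The metric (X-Y)^T Sigma^-1 (X-Y) for 2-D points."""
    dx, dy = x[0] - y[0], x[1] - y[1]
    return (dx * (inv_cov[0][0] * dx + inv_cov[0][1] * dy)
            + dy * (inv_cov[1][0] * dx + inv_cov[1][1] * dy))


def representative_weights(samples, rep_indices):
    """Count, for each representative, the remaining samples closest to it."""
    inv_cov = inverse_2x2(covariance_2d(samples))
    weights = {i: 0 for i in rep_indices}
    for j, s in enumerate(samples):
        if j in weights:
            continue  # representatives are not counted themselves
        nearest = min(rep_indices,
                      key=lambda i: distance_sq(s, samples[i], inv_cov))
        weights[nearest] += 1
    return weights


points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
print(representative_weights(points, [0, 5, 3]))  # {0: 2, 5: 0, 3: 1}
```

Here the isolated representative receives a weight of zero, which shows how such a weighting could record the sample population around each representative.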

3.2.3.3.  The  Reduced  Parzen  Classifier  [FuHa] 

Another approach to the problem of selecting r samples is to determine which of several r
element subsets produces the best density estimate, $p_r(x)$. The Reduced Parzen Classifier (RPC)
[FuHa] algorithm uses this approach, and is based on computing a figure of merit by which
density estimates are compared. This figure of merit is actually an entropy expression.

$$\int \ln\left\{\frac{p_r(x)}{p_N(x)}\right\} p_N(x)\,dx \le 0$$

The inequality $\ln(a) \le a - 1$ may be used to show the bound on this expression. Strict equality
holds iff $p_r(x) = p_N(x)$. This expectation is replaced by a sample mean calculation in the actual
algorithm.

$$J = \frac{1}{N} \sum_{i=1}^{N} \left( \ln[p_r(X_i)] - \ln[p_N(X_i)] \right)$$

* How to select the size of the reduced sample set, r, is briefly addressed in Section 3.2.3.6.

As in the last section, density estimates are computed using Parzen estimation techniques.
If the computational burden were not an issue, this criterion could be maximized over all
possible r element subsets of the N available samples to select the optimum reduced sample set. To
make the computational expense reasonable, however, a simplification is made in the algorithm:

(1) Arbitrarily select an initial assignment of r samples from the N sample data set. Call the r
sample set STORE, and the remaining N-r samples TEST.

(2) For each element, $X_t$, in TEST, compute the change in J that results if the sample is
transferred to STORE: $\Delta J_1(X_t) = J_{r+1}(X_t) - J_r$.

(3) Pick the element, $X_t$, corresponding to the largest $\Delta J_1$ (and call it $X_t^*$).

(4) For each element, $X_s$, in STORE, compute the change in J that results if the sample is
transferred to TEST: $\Delta J_2(X_s) = J_r(X_s) - J_{r+1}$.

(5) Find the element, $X_s$, corresponding to the largest $\Delta J_2$ (and call it $X_s^*$).

(6) The change of J due to these two operations is $\Delta J = \Delta J_1 + \Delta J_2$. If $X_s^*$ exists such that
$\Delta J > 0$, transfer $X_s^*$ to TEST, transfer $X_t^*$ to STORE, and go to step 2.

(7) Otherwise, find the element, $X_t$, corresponding to the next largest $\Delta J_1$ (and call it $X_t^*$).

(8) If $X_t^*$ exists, go to step 4.

(9) Otherwise, stop.

As  pointed  out  in  [FuHa],  this  algorithm  does  not  necessarily  select  the  optimal  reduced 
sample  set,  but  experimentation  has  shown  the  algorithm  to  work  well. 
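The sample-mean criterion J and the exchange of samples between STORE and TEST can be illustrated with a deliberately simplified sketch. It is not the nine-step procedure of [FuHa]: instead of tracking the two changes in J separately, it recomputes J for each candidate exchange and accepts the first one that increases it. The one-dimensional Gaussian kernel, the kernel width h, and the names parzen, criterion_j and reduce_samples are assumptions made for illustration.

```python
import math


def parzen(x, samples, h):
    """1-D Gaussian-kernel Parzen estimate at x."""
    total = sum(math.exp(-((x - s) ** 2) / (2.0 * h * h)) for s in samples)
    return total / (len(samples) * h * math.sqrt(2.0 * math.pi))


def criterion_j(store, all_samples, h):
    """Sample mean of ln p_r(X_i) - ln p_N(X_i) over all N samples."""
    return sum(math.log(parzen(x, store, h)) - math.log(parzen(x, all_samples, h))
               for x in all_samples) / len(all_samples)


def reduce_samples(samples, r, h=1.0):
    """Exchange STORE and TEST members while an exchange increases J."""
    store = list(range(r))                # arbitrary initial assignment
    test = list(range(r, len(samples)))
    best = criterion_j([samples[i] for i in store], samples, h)
    improved = True
    while improved:
        improved = False
        for t in test:
            for s in store:
                trial = [i for i in store if i != s] + [t]
                j = criterion_j([samples[i] for i in trial], samples, h)
                if j > best + 1e-12:      # accept the first improving exchange
                    store, best = trial, j
                    test = [i for i in test if i != t] + [s]
                    improved = True
                    break
            if improved:
                break
    return sorted(store), best


data = [0.0, 0.1, 0.2, 4.0, 4.1, 4.2]     # two well separated clusters
store, j_value = reduce_samples(data, 2)
print(store, j_value)
```

With two well separated clusters and r = 2, the search cannot stop while both stored samples lie in the same cluster, since moving one of them to the other cluster raises J; the final STORE therefore holds one sample from each cluster.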

An important point to note is that, just as in the work of the last section, this algorithm
depends upon a "good" density estimate, $p_N(x)$. It is first used in the selection of the reduced
sample set, and then in designing the Bayes classifier where a good sample covariance estimate
is needed.

This is the primary reason that the initial hopes of this study are not realized. The results
of the next section show that class densities are so drastically changed when features are
extracted from noisy images that it makes no sense to estimate them with the same orientation
samples chosen for a noiseless case. In other words, there does not appear to be a set of "key
orientations" for designing a general classifier.

3.2.3.4. Experimental Results

The Reduced Parzen Classifier of Section 3.2.3.3 was implemented, and the experiments of
this section are meant to increase our understanding of the density estimation problem so that
we may intelligently approach the classification vs. range experiment.

The  primary  questions  addressed  are: 

(1)  Given  60  images  generated  at  six  degree  orientation  intervals  for  a  single  target  model  at 
0.5  km.,  which  ten  orientations  best  estimate  the  target’s  class  density? 

(2)  Are  the  same  ten  orientations  chosen  for  a  different  target? 

(3) How does range affect which samples are chosen for the reduced sample set?

(4)  How  does  noise  affect  the  selection? 

(5)  How  does  the  feature  set  used  affect  the  selection? 

(6)  How  does  classification  accuracy  vary  with  the  size  of  the  reduced  sample  set? 

The  following  experiments  were  performed  to  answer  these  questions: 

(1)  Silhouettes  extracted  from  60  noiseless,  0.5  km.,  BRDM2  synthetic  images.  Area  and 
H2/Area  features  computed  for  these  silhouettes.  RPC  algorithm  used  to  select  ten 
optimal  samples  for  density  estimation.  Repeated  for  M60A1  target  class. 

(2)  Experiment  (1)  repeated  for  5.0  km.  data. 

(3)  Experiment  (1)  repeated  with  high  noise  (PDO=0.102,  GSD=17.2)*  added  to  synthetic 
imagery. 

(4)  Experiment  (3)  repeated  with  Area  and  Rectangularity  features. 

(5)  Experiments  (3)  and  (4)  repeated  using  reduced  sample  set  sizes  of  1,3,5,7,30  and  50  to 
train  the  classifier.  Classification  accuracies  computed  for  each  case. 

All  of  these  experiments  were  performed  on  two  target  classes  in  a  two  dimensional 
feature  space,  but  may  be  generalized  to  M  classes  in  an  N  dimensional  feature  space. 

NOTE: 

The feature samples of the following scatter plots have been transformed through a
simultaneous diagonalization process which diagonalizes both covariance matrices, leaving one
equal to the identity matrix. Thus, the samples no longer possess the physical meaning
indicated by the axes labels, but were derived from these features.


*  As  defined  in  [KaYo87],  PDO=Probability  of  Drop  Out,  GSD=Gaussian  Standard  Deviation. 
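
The simultaneous diagonalization mentioned in the note can be sketched as follows. This is a minimal illustration of the standard two-step procedure (whiten one covariance matrix, then rotate to diagonalize the other), not the code used in this study; the function name and the two example covariance matrices are assumptions made for the example.

```python
import numpy as np

def simultaneous_diagonalization(cov1, cov2):
    """Return A such that A.T @ cov1 @ A = I and A.T @ cov2 @ A is diagonal."""
    # Step 1: whiten cov1. Column j of the eigenvector matrix is scaled
    # by 1/sqrt(eigenvalue j), so whiten.T @ cov1 @ whiten equals I.
    evals1, evecs1 = np.linalg.eigh(cov1)
    whiten = evecs1 / np.sqrt(evals1)
    # Step 2: rotate with the eigenvectors of the whitened cov2. A rotation
    # leaves the identity unchanged and makes the second matrix diagonal.
    cov2_white = whiten.T @ cov2 @ whiten
    _, evecs2 = np.linalg.eigh(cov2_white)
    return whiten @ evecs2

# Hypothetical covariance matrices for two classes in a 2-D feature space.
cov_a = np.array([[4.0, 1.0], [1.0, 2.0]])
cov_b = np.array([[1.0, -0.5], [-0.5, 3.0]])

A = simultaneous_diagonalization(cov_a, cov_b)
new_a = A.T @ cov_a @ A   # identity matrix
new_b = A.T @ cov_b @ A   # diagonal matrix
```

A feature vector x is mapped to y = A.T @ x, which is why the transformed samples no longer carry the physical meaning of the original axes.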
3.2.3.4.1. Reduced Sample Sets for Two Noiseless Target Classes

Figures 3.2.3(b) and 3.2.4 show the ten orientations chosen to estimate the BRDM2 class
density for a 0.5 km. sample set. The chosen samples fall very near one another in the feature
space. Some, in fact, lie directly on top of one another since their orientations differ by exactly
180 degrees. These results seem to contradict the reasoning of Section 3.2.3.2, where it was
suggested that the representative samples be spread well across the class density. It may be,
however, that the weighting procedure of the same section would produce results similar to those
seen here by "zeroing-out" fringe samples.

Figures 3.2.3(b) and 3.2.5 show the ten orientations chosen to estimate the M60A1 class
density for a 0.5 km. sample set. Since the silhouette characteristics for the two classes are very
different (at least we hope so for classification purposes), it is entirely reasonable that different
reduced sample sets were chosen to represent the two classes.

3.2.3.4.2.  Effects  of  Range 

Does  it  make  sense  that  the  orientations  chosen  for  the  reduced  sample  set  will  be  different 
at  5.0  km.  than  they  were  at  0.5  km?  This  probably  depends  on  the  target  since  certain  target 
characteristics  may  be  distinguishable  at  close  range,  but  disappear  at  long  range  with  fewer 
pixels  on  target.  For  example,  the  main  gun  of  a  tank  may  be  invisible  at  5.0  km.,  and  this 
could  significantly  affect  which  silhouette  angles  best  represent  this  target  class. 

Figure  3.2.6  shows  the  sets  of  ten  samples  chosen  by  the  RPC  algorithm  for  the  two  targets 
at  5.0  km.  Although  the  orientations  of  the  0.5  km.  reduced  sample  sets  are  different  from  those 
of  the  5.0  km.,  Figure  3.2.7  shows  that  the  feature  samples  fall  in  similar  regions  of  the  5.0  km. 
scatter  plots  and  produce  equal  resubstitution  errors  of  0%.* 

3.2.3.4.3.  Effects  of  Noise 

Figure  3.2.8  shows  that  classification  becomes  much  less  accurate  when  noise  is  added  to 
the  target  images  making  silhouette  segmentation,  and  therefore  feature  extraction,  much  more 
difficult.  It  is  also  evident  that  the  feature  samples  are  not  just  perturbed  by  the  noise,  but  the 
shape  of  the  class  density  is  drastically  affected.  Will  the  same  orientations  chosen  to  estimate 
noiseless  class  densities  be  most  useful  in  representing  noisy  densities?  Figure  3.2.9  clearly 
indicates  that  this  is  not  the  case. 

The orientations of the reduced sample sets chosen using the noisy class densities are
entirely different from those that were chosen from noiseless images. The resubstitution error is
well over three times worse (11.7% versus 3.3%) when the orientations chosen from the noiseless
densities are used to design the reduced Parzen classifier for noisy samples.

* Resubstitution error is the error computed when the classifier is tested using samples of the
design set [Fu72].
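
To make the footnote concrete, the sketch below computes a resubstitution error for a plain Parzen classifier. It is only an illustration under stated assumptions: an isotropic Gaussian kernel of width h, equal class priors, and a toy data set invented for the example. It does not reproduce the sample selection step of the reduced Parzen classifier of Section 3.2.3.3.

```python
import math

def parzen_density(x, samples, h):
    """Average of isotropic Gaussian kernels of width h centred on each sample."""
    d = len(x)
    norm = (2.0 * math.pi * h * h) ** (d / 2.0)
    total = 0.0
    for s in samples:
        sq = sum((xi - si) ** 2 for xi, si in zip(x, s))
        total += math.exp(-sq / (2.0 * h * h)) / norm
    return total / len(samples)

def classify(x, design_sets, h):
    """Return the label whose Parzen density estimate is largest at x."""
    return max(design_sets, key=lambda label: parzen_density(x, design_sets[label], h))

def resubstitution_error(design_sets, h):
    """Fraction of design samples misclassified by the classifier they designed."""
    wrong = 0
    total = 0
    for label, samples in design_sets.items():
        for s in samples:
            total += 1
            if classify(s, design_sets, h) != label:
                wrong += 1
    return wrong / total

# Hypothetical, well separated two-class design set in a 2-D feature space.
toy = {
    "A": [(0.0, 0.0), (0.2, 0.1), (0.1, -0.1)],
    "B": [(3.0, 3.0), (3.1, 2.9), (2.8, 3.2)],
}
error = resubstitution_error(toy, h=0.5)   # 0.0 for this separable toy set
```

Because every design sample contributes a kernel centred on itself, the resubstitution error is an optimistic estimate of the error on unseen samples.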
Figure 3.2.3 (a) Sixty samples of each target class. Resubstitution error=0%.

(b) Ten samples of each reduced sample set. Dots denote samples NOT
chosen for the reduced sample set. Resubstitution error=0%.
Corresponding target orientations:

BRDM2: 30,36,42,138,144,150,210,216,324,330 degrees
M60A1: 42,48,84,132,138,228,234,264,312,318 degrees

Figure 3.2.4 BRDM2 silhouettes chosen for 0.5 km. reduced sample set.

Figure 3.2.5 M60A1 silhouettes chosen for 0.5 km. reduced sample set.
Two Class Scatter (5.0 km)

Figure 3.2.6 (a) Sixty samples of each target class. Resubstitution error=0%.

(b) Ten samples of each reduced sample set. Dots denote samples NOT
chosen for the reduced sample set. Resubstitution error=0%.
Corresponding target orientations:

BRDM2: 30,36,42,132,138,144,204,210,318,324 degrees
M60A1: 78,96,104,132,228,234,306,312,318,324 degrees

Figure 3.2.7 (a) Reduced sample sets chosen using 5.0 km. data. Resubstitution error=0%.
(b) Reduced sample sets chosen using 0.5 km. data. Resubstitution error=0%.
(Dots denote samples NOT chosen for the reduced sample sets.)
Figure 3.2.8 (a) Sixty samples of each target class. Resubstitution error=3.3%.

(b) Ten samples of each reduced sample set. Dots denote samples NOT chosen
for the reduced sample set. Resubstitution error=3.3%.
Corresponding target orientations:

BRDM2: 0,84,264,282,300,306,318,330,342,354 degrees
M60A1: 84,126,150,174,192,210,264,288,318,354 degrees

Figure 3.2.9 (a) Reduced sample set chosen using noisy 0.5 km. data. Resubstitution error=3.3%.

(b) Reduced sample set chosen using noiseless 0.5 km. data. Resubstitution error=11.7%.
(Dots denote samples NOT chosen for the reduced sample sets.)
3.2.3.4.4. The Feature Space and Selection of the Reduced Sample Set

Figure  3.2.10  examines  the  class  densities  for  noisy  samples  using  a  different  feature 
space.  As  expected,  the  orientations  chosen  for  the  reduced  sample  set  in  this  feature  space  are 
different  from  those  chosen  in  the  previous  feature  space.  Also,  the  overlap  is  different  for  the 
two  feature  spaces  as  evidenced  by  the  resubstitution  errors. 

3.2.3.4.5.  Classification  Accuracy  and  the  Size  of  the  Reduced  Sample  Set 

Although  no  theoretical  analysis  was  performed  to  determine  an  optimal  size  for  the 
reduced  sample  set,  some  interesting  experimental  results  are  discussed  here.  Since  the  samples 
of  Figures  3.2.8(a)  and  3.2.10(a)  exhibited  class  overlap,  they  provided  an  excellent  opportunity 
to  determine  how  classification  accuracy  varies  with  the  size  of  the  reduced  sample  set. 

Ten was arbitrarily chosen as the size of the reduced sample sets used in all of the previous
plots and, as shown by the graphs of Figure 3.2.11, is an interesting sample size. In the
Area-H2/Area feature space, the classification accuracy is the same as that attained using all
60 samples. For the Area-Rectangularity feature space, the ten sample set produces one of the
best classification accuracies, better than that attained using all 60 samples (1.7% versus 2.5%
resubstitution error in Figure 3.2.10).

It  is  important  to  note  that  these  experiments  do  not  establish  ten  as  an  important  number. 
The  results  presented  here  are  for  two  dimensional  feature  spaces  with  only  two  target  classes 
present. The classification accuracy vs. sample size could be very different in higher
dimensional feature spaces with multiple target classes.

Nonetheless,  it  is  interesting  to  see  that  reduced  sample  sets  can  improve  classification 
accuracy.  This  may  occur  because  of  outlier  or  boundary  samples  in  large  sample  sets  which,  in 
addition  to  being  misclassified,  may,  when  included  in  the  design  set,  cause  the 
misclassification  of  other  samples. 

3.2.3.5. SUMMARY

Several interesting results have been found using the reduced Parzen classifier of Section
3.2.3.3. It was interesting to see, for instance, that the feature samples chosen for the reduced
sample sets were not spread across the entire class density, but were actually grouped together.
This may not seem intuitive from a density estimation point of view, but the classification
accuracies exhibited by the reduced sample sets certainly validate it from the classifier design
perspective. It was also instructive to see that a carefully selected smaller set of design samples
can produce classification results comparable with those of larger design sets.

Finally, the results of this study clearly show that it will not be possible to select a set of
key target orientations which will provide optimal target discrimination under all noise
conditions.
Figure 3.2.10 (a) Sixty samples of each target class. Resubstitution error=2.5%.

(b) Ten samples of each reduced sample set. Dots denote samples NOT
chosen for reduced sample set. Resubstitution error=1.7%.
Corresponding target orientations:

BRDM2: 18,42,66,90,96,138,156,174,306,312 degrees
M60A1: 150,168,180,186,210,228,252,276,282,342 degrees

Classification Error vs. Set Size

Figure 3.2.11 Some classification errors obtained using various reduced sample
3.2.3.6. Future Work

The results of this study should be verified in higher dimensional feature spaces with
multiple target classes. Also, a more rigorous study to determine the optimal size of the reduced
sample set should be performed. A first step might be to generalize the classification error vs.
sample size experiment of this study to the multiple target, N dimensional feature space
situation.

3.2.3.7. A Survey of Classical Clustering Techniques

What follows is a look at some well-known clustering techniques. One primary use of
clustering (unsupervised learning) has been pattern recognition when no a priori labels are
available.

Our hope is that one, or a combination, of these classical techniques may be useful in
partitioning a target class into regions of similar samples that may be represented by a set of
distinguished viewpoints.

This goal may meet the same difficulties seen in the study of [KaYo88]. Clustering is a
completely different approach, however, and it may be reasonable to expect groups of
similar samples to stay grouped under various levels of imagery noise and range characteristics
even though the class density as a whole is completely distorted.

3.2.3.7.1. Outline of methods

Below is an outline of some of the oldest and most well-known clustering techniques.
These techniques fall into one of two categories:

1) Hierarchical: Elements split/merged at one level remain split/merged at all
lower/higher levels.

2) Non-hierarchical

I. Hierarchical Clustering Techniques

   A) Agglomerative Methods (bottom-up)

      1) Linkage

         a) Single-linkage (Nearest-neighbor)

            • Joins clusters by their two nearest elements

            • Terminates when the shortest distance between the nearest elements of any
              two clusters exceeds some threshold

            • Corresponds to the minimum spanning tree of graph theory if the threshold
              is not met and grouping continues through all elements

            • Has a "chaining" characteristic which is good at finding long, winding
              clusters.

         b) Complete-linkage (Furthest-neighbor)

            • Joins clusters if their two most distant elements are closer than some
              threshold

            • Terminates when no clusters may be merged under this threshold

            • Each cluster corresponds to a complete subgraph in graph theory

            • "Chaining" is discouraged.

         c) Compromise

            • Some type of averaging criterion is used rather than the extremum
              measurements

            • Tends to handle outliers better.

      2) Centroid Methods

         • Clusters with the closest means (centroids) are merged.

      3) Error Sum-of-Squares

         • At each level of merging the clustering must be the one which increases this
           criterion by the least. (Ex. [Wa63])

   B) Divisive (top-down)

      • Little mention in the literature. Works on splitting rather than merging.

II. Non-hierarchical

   A) Fixed number of clusters

      1) Nearest Centroid Sorting

         • Start with seed elements and merge in elements according to which cluster
           centroid is nearest. (Ex. [Fo65], [Ma67])

   B) Variable number of clusters

      • These algorithms usually have some device for either:

         1) reducing the number of clusters if two clusters are very near one another, or

         2) increasing the number of clusters if a data unit is too far from all existing
            clusters.

      (Ex. MacQueen's k-means method with coarsening and refining parameters [Ma67],
      ISODATA (very elaborate and expensive) [BaHa65])
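
As an illustration of the first entry in the outline, the following is a minimal sketch of single-linkage agglomerative clustering with a distance threshold. It is a brute-force version written for clarity, not an implementation from any of the cited references; the function name, the Euclidean metric, and the example points are assumptions.

```python
import math

def single_linkage(points, threshold):
    """Merge clusters until the nearest pair of clusters is farther apart than threshold."""
    clusters = [[p] for p in points]

    def gap(a, b):
        # Single-linkage distance: distance between the two nearest elements.
        return min(math.dist(p, q) for p in a for q in b)

    while len(clusters) > 1:
        pairs = [(gap(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        d, i, j = min(pairs)
        if d > threshold:
            break                        # termination rule from the outline
        clusters[i].extend(clusters[j])  # j > i, so deleting j keeps i valid
        del clusters[j]
    return clusters

# Hypothetical data: a chain of four points and a distant pair.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 10), (10, 11)]
clusters = single_linkage(points, threshold=1.5)
sizes = sorted(len(c) for c in clusters)   # [2, 4]
```

The chain of four points ends up in one cluster even though its end points are 3 units apart, which is the "chaining" behaviour noted in the outline. Replacing min with max in gap gives the complete-linkage distance.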

3.2.3.7.2. Some Criteria for Driving a Clustering Algorithm

The algorithms above use various criteria for determining which partition is optimum. The
nearest centroid sorting methods, for instance, try to minimize the error sum-of-squares by
constructing cloud-like clusters closely packed around a centroid. Some other criteria are:
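• Minimize:

$$\operatorname{tr} S_W, \qquad |S_W|, \qquad \operatorname{tr} S_T^{-1} S_W, \qquad \frac{|S_W|}{|S_T|}, \qquad \text{error sum-of-squares}$$

• Maximize:

$$\operatorname{tr} S_W^{-1} S_B$$

where

$$S_W = \sum_{i=1}^{c} \sum_{X \in \Omega_i} (X - M_i)(X - M_i)^T, \qquad \Omega_i \text{ denotes cluster } i$$

$$S_B = \sum_{i=1}^{c} n_i (M_i - M)(M_i - M)^T$$

$$S_T = S_W + S_B$$

Here c is the number of clusters, n_i is the number of samples in cluster i, M_i is the mean of
cluster i, and M is the mean of all samples.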


The eigenvalues of $S_W$ measure the spread of the samples (within their respective clusters)
along the feature space coordinate axes. Minimization of $\operatorname{tr} S_W$ is, in fact, the same as
minimizing the error sum-of-squares. The eigenvalues of $S_B$, on the other hand, indicate the spread of
the clusters within the entire class density which should be maximized. The $\operatorname{tr} S_W^{-1} S_B$ criterion
is a measure of the ratio of these with the between-class spread in the numerator.

Note that since the total-scatter matrix, $S_T$, is invariant to the cluster partitioning chosen, a
linear transformation can be used to make the scatter white ($S_T = I$) thereby reducing these
criteria to functions of $S_W$.
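
The scatter matrices and two of the criteria can be computed as in the following sketch. The six sample points and the two labelings (one matching the visible grouping, one mixing the groups) are hypothetical and serve only to show that the criteria rank the partitions as expected and that the total scatter does not depend on the partition.

```python
import numpy as np

def scatter_matrices(samples, labels):
    """Return (S_W, S_B) for samples (n x d array) under a cluster labeling."""
    samples = np.asarray(samples, dtype=float)
    labels = np.asarray(labels)
    overall_mean = samples.mean(axis=0)
    d = samples.shape[1]
    s_w = np.zeros((d, d))
    s_b = np.zeros((d, d))
    for c in np.unique(labels):
        members = samples[labels == c]
        mean_c = members.mean(axis=0)
        centered = members - mean_c
        s_w += centered.T @ centered                  # within-cluster scatter
        diff = (mean_c - overall_mean).reshape(-1, 1)
        s_b += len(members) * (diff @ diff.T)         # between-cluster scatter
    return s_w, s_b

X = np.array([[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]], dtype=float)
good = [0, 0, 0, 1, 1, 1]   # matches the two visible groups
bad = [0, 1, 0, 1, 0, 1]    # mixes the groups

sw_good, sb_good = scatter_matrices(X, good)
sw_bad, sb_bad = scatter_matrices(X, bad)

# Total scatter computed directly, independent of any partition.
centered_all = X - X.mean(axis=0)
s_t = centered_all.T @ centered_all

tr_sw_good = np.trace(sw_good)                         # to be minimized
tr_sw_bad = np.trace(sw_bad)
j_good = np.trace(np.linalg.solve(sw_good, sb_good))   # tr(S_W^-1 S_B), to be maximized
j_bad = np.trace(np.linalg.solve(sw_bad, sb_bad))
```

np.linalg.solve is used instead of forming the inverse of S_W explicitly; both give the same trace when S_W is nonsingular.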

Other  types  of  criteria  have  also  been  used.  Some  work  has  been  done,  for  instance,  by 
viewing  classes  as  mixtures  of  multivariate  normal  distributions.  Examples  can  be  found  in 
[ScSy71]  and  [Wo70]. 


3.2.3.7.3.  Choosing  a  Method 

Most  clustering  techniques  are  good  at  identifying  a  specific  type  of  cluster.  For  example, 
the single-linkage method has a unique ability to find long, winding clusters through a
characteristic called "chaining." The nearest centroid sorting methods, on the other hand, tend to
group  samples  into  "cloud-like"  clusters. 

The  best  strategy  in  approaching  a  clustering  problem  may  be  to  use  some  combination  of 
techniques.  Three  such  strategies  are  listed  by  [And73]: 

1)  Obtaining  a  rough  idea  of  the  clusters  with  some  inexpensive  algorithm  may  be  very 
useful  in  determining  which  of  the  more  elaborate  methods  to  use  next.  A  nearest 
centroid  sorting  method  may  be  most  useful  if  the  number  of  clusters  is  known,  but 
otherwise  some  hierarchical  technique  may  be  best. 
2) Once clusters are found, their removal can facilitate later processing. After a
particular technique has found certain types of clusters, other methods may be used to
further group the data.

3)  A  hierarchical  method  may  be  used  as  a  way  of  "seeding"  nearest  centroid  sorting 
algorithms. 

3.2.3.7.4.  Summary 

There are a multitude of algorithms available for performing cluster analysis, and in
many cases some combination of them is necessary to solve the problem. The works of
Scott and Symons [ScSy71] and Wolfe [Wo70], which treated class densities as mixtures of
clusters having normal distributions, may be of particular interest in the problem of
determining distinguished viewpoints for classifier design.

Before  placing  much  confidence  in  the  potential  of  the  above  techniques,  however, 
some  study  should  be  performed  to  determine  the  effects  of  noise  and  range  on  cluster 
structure.  It  may  be  that  the  set  of  distinguished  viewpoints  will  vary  with  these 
phenomena  as  seen  in  [KaYo88]. 

3.2.4.  Target  Detection  Using  Ladar  Data 

Recently, the Night Vision Lab has centered much interest on the detection of tactical
targets using LADAR data. However, because of the active nature of a LADAR sensor
and the amount of time that it needs to gather data, it is advantageous to limit the number
of scans that the sensor makes across a scene during the detection process. Ideally, a
tactical target could be detected using data from a single LADAR scan line. It is generally
believed, however, that more than a single scan line needs to fall on a target for the detector
to work reliably. Thus the following question arises: How many LADAR scan lines need
to fall on a tactical target for it to be reliably detected?

We  have  undertaken  an  exhaustive  study  to  investigate  the  performance  of  a  target 
detector  when  data  from  a  limited  number  of  LADAR  scan  lines  is  available.  The  study  is 
aimed  at  determining  the  performance  of  the  detector  as  the  number  of  lines  that  fall  on  a 
target is varied. We are also investigating detector performance when various preprocessing
techniques are applied to the raw LADAR data.

The following section reports on the results of detecting a target using a single scan
line from a LADAR sensor. Sections 3.2.4.3 - 3.2.4.5 expand these experiments to see if
the false alarm rate can be reduced when multiple scan lines are on target. A description
of the code used to generate these results is presented in Appendix B.
3.2.4.1. Target Detection Using a Single Line of LADAR Data

Night Vision Lab has centered much interest on the detection of tactical targets using
a single stripe of LADAR data. We want to determine if a single scan line of LADAR data
contains enough information to allow a system to determine the presence and approximate
location of a target. In this section we present two approaches to this problem. The
first is to simply measure the length between range discontinuities; if the length falls
between two thresholds, the segment is labeled as a target. The second method draws on
detection theory in that it measures the characteristics of the background and the targets in
a known image, and then classifies an unknown scan line by comparing it to the known
characteristics. The following sections present more details about each of the approaches and
show that the second approach subsumes the first.

3.2.4.1.1. A Simple Target Detector

This section presents a possible solution to the detection problem that is deterministic
and is easily expressed algorithmically. This method is similar to the approach currently
used by commercial groups. It works as follows: First, the scan line is broken up
into piecewise linear segments (i.e., if the range value of a pixel is within a certain
distance of those of its neighbors, it is said to lie in the same piece). Next, the
detector makes a decision via the following rule:

IF  segment  length  is  >  MIN_LENGTH 
AND  segment  length  is  <  MAX_LENGTH 
AND  there  is  a  jump  discontinuity  >  MIN_DISTANCE 
THEN  the  segment  represents  a  target. 

Because  of  the  effect  of  the  background  noise  and  the  statistical  variation  of  the  target 
range  values,  one  would  most  likely  find  that  such  a  detector  is  not  robust. 
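
The rule above can be sketched in a few lines. This is a minimal illustration under several assumptions that the text does not fix: segment length is counted in pixels, a "piece" is a run of pixels whose neighboring range values differ by at most max_step, and the jump discontinuity is measured at either end of the segment. The parameter values and the scan line are invented for the example.

```python
def split_segments(ranges, max_step):
    """Split a scan line into runs whose neighbouring range values differ by <= max_step."""
    if not ranges:
        return []
    segments = []
    start = 0
    for i in range(1, len(ranges)):
        if abs(ranges[i] - ranges[i - 1]) > max_step:
            segments.append((start, i - 1))
            start = i
    segments.append((start, len(ranges) - 1))
    return segments

def detect_targets(ranges, max_step, min_length, max_length, min_distance):
    """Return (first, last) pixel indices of segments that satisfy the rule."""
    hits = []
    for first, last in split_segments(ranges, max_step):
        length = last - first + 1
        # Size of the range jump at each end of the segment, where one exists.
        jumps = []
        if first > 0:
            jumps.append(abs(ranges[first] - ranges[first - 1]))
        if last < len(ranges) - 1:
            jumps.append(abs(ranges[last + 1] - ranges[last]))
        if min_length < length < max_length and any(j > min_distance for j in jumps):
            hits.append((first, last))
    return hits

# Hypothetical scan line: background near 50 m with an object near 30 m.
scan = [50.0, 50.2, 50.1, 30.0, 30.1, 30.0, 29.9, 30.1, 50.3, 50.1]
hits = detect_targets(scan, max_step=1.0, min_length=3, max_length=8, min_distance=5.0)
```

For this scan line the only segment that passes all three tests is pixels 3 through 7. Each test is a hard threshold, so a single noisy range value can split a segment or hide a jump.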

3.2.4.2.  A  Detection  Theory  Based  Approach  to  Target  Detection 

We have chosen to apply concepts from detection theory to the problem of robustly
detecting tactical targets with a limited amount of LADAR data. We are able to apply
these concepts by temporarily assuming that only a single scan line of LADAR data is
used to detect the targets. If this assumption is true, then the detector is, in effect, working
on one dimensional signals. Therefore, the problem to be solved is the classic problem of
detecting signals with unknown parameters in noise [Trees68], where the signal parameters
depend on the type of target, orientation of the target, and position of the scan line on
the target. Detection theory describes the optimal method of solving this type of problem
by using estimates of the probability density functions of the target and background. Once
we have described this target detection method using a single LADAR scan line, we will
show how to generalize it so that multiple LADAR scan lines may be used to make the
detection process more robust.


3.2.4.3.  Target  Detection  Using  a  Single  Scan  Line  of  LADAR  Data 

When designing the detector, it is helpful to assume (temporarily) that all targets will
be a constant distance from the LADAR sensor and the sensor will have a constant field of
view. Based on these assumptions, it may also be assumed that at most N range pixels
from a single scan line fall on the target. Since jump discontinuity information provides strong
evidence that a pixel belongs to a target, it is advantageous to guarantee that at least one
jump discontinuity will be included in the data for any target pixel. This is accomplished
by using the range values of a window of M = N+2 pixels on the same scan line, that is, the
pixel and its M-1 closest neighbors. These data values can be interpreted as a data vector
representing a single point in M dimensional space. Therefore, the collection of all possible
target (or background) data vectors is described by a probability density function in M
dimensional space. Such density functions can be estimated and the estimates used by a
robust target detector. One type of target detector that uses estimated density functions
works as follows: Denote the density function corresponding to the target as

$$f_{target}(x_1, x_2, \ldots, x_M)$$

and the background density function as

$$f_{background}(x_1, x_2, \ldots, x_M).$$

If the M values $P_0 = (d_1, d_2, \ldots, d_M)$ correspond to the data vector from
some pixel, then

$$f_{target}(d_1, d_2, \ldots, d_M)$$

and

$$f_{background}(d_1, d_2, \ldots, d_M)$$

can be computed at $P_0$. The pixel can then be said to belong to an object if

$$f_{target}(d_1, d_2, \ldots, d_M) > C \, f_{background}(d_1, d_2, \ldots, d_M) \qquad (3.1)$$

where C is equivalent to a threshold and depends on the a priori class probabilities and the
costs of false alarms and detection misses. In order to make the problem tractable, the
functional form of the density functions is usually assumed to be Gaussian; thus, the
density functions can be expressed as:


$$p_i(X) = \frac{1}{(2\pi)^{M/2} \, |\Sigma_i|^{1/2}} \exp\left[ -\tfrac{1}{2} (X - M_i)^T \Sigma_i^{-1} (X - M_i) \right]$$

where $\Sigma_i$ and $M_i$ are the covariance matrix and mean of density i, respectively, and the
superscript T stands for the transpose of a matrix. Substituting this functional form for the
distribution into equation (3.1) and cancelling the common factor $(2\pi)^{M/2}$ allows us to classify
a pixel as belonging to a target if:

$$\frac{1}{\sqrt{|\Sigma_t|}} \exp\left[ -\tfrac{1}{2} (X - M_t)^T \Sigma_t^{-1} (X - M_t) \right] > \frac{C}{\sqrt{|\Sigma_b|}} \exp\left[ -\tfrac{1}{2} (X - M_b)^T \Sigma_b^{-1} (X - M_b) \right]$$

The subscripts t and b denote the target and background densities, respectively. Taking the
negative natural log of both sides (which reverses the inequality) allows the pixel to be
classified as belonging to a target if

$$\tfrac{1}{2} \ln |\Sigma_t| + \tfrac{1}{2} (X - M_t)^T \Sigma_t^{-1} (X - M_t) < -\ln(C) + \tfrac{1}{2} \ln |\Sigma_b| + \tfrac{1}{2} (X - M_b)^T \Sigma_b^{-1} (X - M_b)$$

This can also be expressed as

$$(X - M_b)^T \Sigma_b^{-1} (X - M_b) - (X - M_t)^T \Sigma_t^{-1} (X - M_t) + \ln \frac{|\Sigma_b|}{|\Sigma_t|} > 2 \ln(C)$$


The left-hand side of this inequality is commonly called the log-likelihood ratio. Note that
if the value for a pixel is replaced by its log-likelihood ratio,

$$(X - M_b)^T \Sigma_b^{-1} (X - M_b) - (X - M_t)^T \Sigma_t^{-1} (X - M_t) + \ln \frac{|\Sigma_b|}{|\Sigma_t|}$$

then the optimal value of C can be determined by performing a threshold over the entire
image. With the ratio written in this form, the more positive values represent pixels that are
strongly believed to be part of the target (far from the background density and close to the
target density), and the more negative values represent pixels that are strongly believed to be
part of the background. We will call the image formed by replacing every pixel by its
log-likelihood value the log-likelihood image.
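
The sketch below evaluates this log-likelihood ratio along one synthetic scan line. It is an illustration under assumptions, not the code of Appendix B: the training data are random numbers invented for the example, the window length m = 5 is arbitrary, the scan line ends are padded by repeating the edge value, and the threshold corresponds to C = 1. With the ratio in the form Q_b - Q_t + ln(|Σ_b|/|Σ_t|), larger values favor the target.

```python
import numpy as np

def data_vectors(scan_line, m):
    """One row per pixel: the pixel and its m - 1 nearest neighbours (m odd)."""
    half = m // 2
    padded = np.pad(np.asarray(scan_line, dtype=float), half, mode="edge")
    return np.array([padded[i:i + m] for i in range(len(scan_line))])

def fit_gaussian(vectors):
    """Sample mean vector and covariance matrix of training data vectors (rows)."""
    vectors = np.asarray(vectors, dtype=float)
    return vectors.mean(axis=0), np.cov(vectors, rowvar=False)

def log_likelihood_ratio(x, mean_t, cov_t, mean_b, cov_b):
    """Q_b - Q_t + ln(|cov_b| / |cov_t|); larger values favour the target."""
    dt = x - mean_t
    db = x - mean_b
    q_t = dt @ np.linalg.solve(cov_t, dt)   # Mahalanobis distance to target
    q_b = db @ np.linalg.solve(cov_b, db)   # Mahalanobis distance to background
    _, logdet_t = np.linalg.slogdet(cov_t)
    _, logdet_b = np.linalg.slogdet(cov_b)
    return q_b - q_t + logdet_b - logdet_t

# Synthetic training vectors (illustrative only): flat target surface near
# 30 m with little variation, cluttered background near 50 m.
rng = np.random.default_rng(0)
m = 5
target_train = 30.0 + 0.1 * rng.standard_normal((200, m))
background_train = 50.0 + 5.0 * rng.standard_normal((200, m))
mean_t, cov_t = fit_gaussian(target_train)
mean_b, cov_b = fit_gaussian(background_train)

scan = [52.0, 47.0, 55.0, 30.0, 30.1, 29.9, 30.0, 30.1, 30.0, 29.9, 49.0, 56.0, 44.0]
llr = np.array([log_likelihood_ratio(v, mean_t, cov_t, mean_b, cov_b)
                for v in data_vectors(scan, m)])
detected = llr > 2.0 * np.log(1.0)   # threshold 2 ln(C) with C = 1
```

Because the toy target model was trained only on windows that lie entirely on the flat surface, pixels near the ends of the object (whose windows include background values) are not detected here. Training vectors that include the jump discontinuities, as described in the text, would be needed to detect them.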


Note that by assuming that the distributions are Gaussian, the decision function is
completely specified by the mean vectors and covariance matrices of the distributions ($M_t$,
$\Sigma_t$, $M_b$, $\Sigma_b$). Thus, these are the only parameters that the detector needs in order to
operate. This type of statistical detection scheme is fairly robust because of its ability to
handle random variations of the signal coming off of the target. It also minimizes false
alarms by compensating for the statistical nature of the background. Note that this detection
scheme is easily augmented to handle randomly sized targets caused by objects at
varying ranges and the imaging device using different fields of view. Randomly sized targets
can be accommodated by resampling the scan line around the pixel being classified to
produce the correct sampling density.


It should be noted that this approach subsumes simpler approaches like the first detector
presented. To show this, we will look at a simple example. For the purpose of the
example, assume that the background is purely random and that the target is a box of
length L that is being scanned along one of its sides, as shown in Figure 3.2.12. In the
deterministic approach, based on recording lengths of constant range values, there are two
properties that a linear segment must possess in order to be classified as a target. First,
the length of the segment must lie between two bounds, MIN_LENGTH < ℓ < MAX_LENGTH,
where ℓ is the length of the segment actually detected. In the statistical detector, this same
requirement is expressed by forcing target pixels to have ℓ highly correlated pixels. Second, the
deterministic approach requires that a jump discontinuity greater than
MIN_DISTANCE be present for a piece to be classified as an object. This requirement also
appears in the statistical detector; it is expressed as highly uncorrelated pixels with mean
values that differ greatly from the pixel being classified. In the deterministic approach the
decision boundary for a pixel corresponds to the rectangular area shown in Figure 3.2.13c.
In the statistical approach, the decision boundary can have arbitrary shape depending
solely on the shape of the density functions. Some commonly used decision boundary
shapes appear in Figure 3.2.13.

Figure 3.2.12 A single LADAR scan line scanning a box of length L.

3.2.4.3.1. Results of Detection Experiments

Five experiments were run to test the detection procedure discussed above. Each
experiment was run on six images representative of the images that we possess. These
sample images are shown at the bottom of Figures 3.1.1-3.1.6. Two types of experiments
were run; the first three experiments used the data vector generation procedure discussed
in Section 3.2.4.3. The dimensionalities of the data vectors in these experiments were M =
25, 11 and 5. In each experiment the data was down-sampled to the correct dimensionality
if necessary. After the images were run through the detector, some simple post-processing
was done to improve the output. This post-processing consisted of labeling as false
detections all detections that were less than 3 pixels long. Figure 3.2.14 (b-d) shows the output
of the detector (after post-processing) for the first three experiments. As is readily seen,
the performance of the detector is satisfactory when large (M = 25) data vectors were used.
The degradation of the detector's results when using small dimensional data vectors is
probably due to the large loss of information that occurs when the data is drastically
down-sampled. Table 3.2.7 lists the probability of detection and the false alarm rate for
the first three experiments. Note that the parameters listed in Table 3.2.8 do not reflect the
decrease in false alarm rate or probability of detection produced by the post-processing.
Also note that changing the decision threshold would improve detection performance for
small dimensional data vectors. It should also be noted that the detector completely
missed some of the targets on scan lines in which a relatively thin portion of the target was
scanned (e.g. on tank turrets).

A second type of detection experiment was also performed; this one was highly tuned
to detect targets in data typical of the first LADAR data set. In these experiments, the
detector worked on the same principle as many of the LADAR segmenters currently under
investigation. Like the segmenters, the detector based its decision on the high variance of
the background pixels. In this scheme, if the variance of the range values of the local
neighborhood of a pixel was low, the pixel was labeled as belonging to a target; otherwise,
it was labeled as background. For these experiments, the data vectors were composed of a

Figure 3.2.13 Commonly used decision boundaries.

Figure 3.2.14 (continued)

Table 3.2.7 Probabilities of detection and false alarm rates for the first three experiments.

Experiment   Dimension   Probability of Detection   False Alarm Rate
1            25          85.8%                      7.3%
2            11          82.3%                      17.8%
3            5           73.8%                      16.9%

pixel's range value and the range values of the pixel's M - 1 immediate neighbors. Two
experiments of this type were run. The data vector in the first experiment was 3-dimensional
and consisted of the range value of a pixel and the range values of its left neighbor
and its right neighbor. A 5-dimensional data vector was used in the second experiment.
The vector was composed of the range value of a pixel and the range values of the two
closest neighbors to its left and to its right. Figure 3.2.14 (e,f) shows the output of the
detector after post-processing. Table 3.2.8 indicates the probability of detection and false
alarm rate of the last two experiments. In these two experiments the targets were detected
fairly reliably because of the high variance of the background pixels. It is expected that
performance of this detection scheme will drop radically when absolute-range LADAR
data becomes available.
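
The variance principle behind this second type of experiment can be illustrated with a minimal sketch. It is not the statistical detector used in the experiments; it simply thresholds the local variance of the range values on one scan line. The function name, the `half_width` parameter (1 gives a 3-pixel neighborhood, 2 a 5-pixel neighborhood) and the threshold value are assumptions chosen for illustration.

```python
from statistics import pvariance


def variance_detect(scan_line, half_width=2, var_threshold=4.0):
    """Label a pixel True (target) when the variance of its local
    neighborhood on the scan line is below var_threshold."""
    n = len(scan_line)
    labels = []
    for i in range(n):
        # Neighborhood is clipped at the ends of the scan line.
        lo = max(0, i - half_width)
        hi = min(n, i + half_width + 1)
        labels.append(pvariance(scan_line[lo:hi]) < var_threshold)
    return labels


# Hypothetical scan line: noisy background with a smooth stretch in the middle.
line = [900, 120, 1500, 40, 300, 301, 302, 301, 300, 1700, 60, 1100]
print(variance_detect(line, half_width=1))
```

A larger neighborhood makes the variance estimate more stable but shrinks the stretch of a target that can be labeled, which is one way thin target portions can be missed.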

Table 3.2.8 Probabilities of detection and false alarm rates for the last two experiments.

Experiment   Dimension   Probability of Detection   False Alarm Rate
4            5           90.9%                      9.6%
5            3           84.6%                      9.0%

3.2.4.3.2.  Conclusions 

These two types of experiments were run to demonstrate the flexibility of the detection
programs and to show that detector performance depends on data vector composition.
The same programs were used in both types of experiment; however, the data vector generator
was prevented from down-sampling the scan lines in the second scheme. This approach to
detection shows a relatively low false alarm rate, especially in experiments 4 and 5, which
use post-processing with a low-dimensional data vector to achieve a false alarm rate under 10%.

3.2.4.4. Target Detection Using Multiple Lines of LADAR Data - Approach 1

We have extended our work on LADAR detection to investigate the effects of using
more than one scan line to detect a target. It is hoped that using the information from
multiple scan lines will improve detector performance. For this report, two techniques for
multi-line detection were investigated. Both techniques are simple extensions of the
statistically based approach discussed previously.

In the single line case, a range pixel was classified as either belonging to a target or to
the background based on its range value and the range values of M - 1 neighbors on the
same scan line. An obvious extension to enable this procedure to handle data from multiple
scan lines is to take M data points from all the scan lines being used to make the
decision. Therefore, if L lines are being used in the decision process, the dimensionality of the
data vector would now be:

Dim = M × L

where Dim is the dimensionality of the multi-line vector, M is the dimensionality of the
single-line vector and L is the number of scan lines.
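
A minimal sketch of how such a multi-line vector could be assembled is given below. It assumes that scan lines are image rows, that M is odd so the window can be centered on the pixel, and that the window stays inside the image; the function and parameter names are illustrative only.

```python
def multiline_vector(image, row, col, m, num_lines):
    """Concatenate m neighboring pixels, centered on column col, from each
    of num_lines consecutive scan lines starting at row."""
    half = m // 2
    vector = []
    for r in range(row, row + num_lines):
        vector.extend(image[r][col - half: col + half + 1])
    return vector


# Hypothetical 4 x 7 range image whose pixel value encodes (row, column).
image = [[10 * r + c for c in range(7)] for r in range(4)]
vec = multiline_vector(image, row=1, col=3, m=3, num_lines=2)
print(len(vec), vec)
```

The length of the result is M × L, which is why the cost of estimating and inverting the covariance matrices grows quickly with either parameter.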

The second extension to multi-line detection is slightly more complicated than the
first. Conceptually, this extension can be viewed as L independent single line detectors
whose outputs are combined to produce a single decision. The subdetectors for each line
work in exactly the same manner as the single line detector previously described. After
each subdetector decides if it has found a target, it sends its decision to the main detector
along with the confidence associated with that decision. The main detector then combines
the results from the subdetectors and arrives at a single decision on whether a target is
present or not. Because the subdetectors in this extension are independent, the size of each
data vector is limited so that only M pixels are needed by a subdetector at any time; this
is a large improvement over the first extension, which needs M × L pixels for detection.
In the previous detection experiments, the output of the detector indicated only if the pixel
was to be classified as target or background and gave no indication of the certainty
attached to that decision. This detection scheme weights the evidence provided by the
subdetectors according to the confidence values attached to their decisions. It does this as
follows: Define $S_{target}$ as the sum of all the confidence values of the scan lines that assert
that a target has been detected and $S_{background}$ as the sum of all the confidence values of the
scan lines that deny the presence of a target. Finally, define $S_{total}$ as

$$S_{total} = S_{target} - S_{background}$$

Now, if $S_{total}$ is positive then the assertion is made that there is a target somewhere in the
lines that were scanned; otherwise, the presence of a target is denied.

Once the presence of a target is hypothesized, the detector tries to establish which
scan lines fell on the target. It accomplishes this by classifying as a target one pixel per
scan line that contributed to $S_{target}$.
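
The combination rule can be sketched as follows. The representation of each subdetector output as an `(is_target, confidence)` pair and the function name are assumptions made for illustration; how the confidence values are computed is not shown here.

```python
def combine_subdetectors(decisions):
    """decisions: one (is_target, confidence) pair per scan line.
    Returns (target_present, s_total, lines_on_target)."""
    s_target = sum(conf for is_target, conf in decisions if is_target)
    s_background = sum(conf for is_target, conf in decisions if not is_target)
    s_total = s_target - s_background
    target_present = s_total > 0
    # Only lines that voted "target" contributed to s_target.
    lines_on_target = (
        [i for i, (is_target, _) in enumerate(decisions) if is_target]
        if target_present else []
    )
    return target_present, s_total, lines_on_target


votes = [(True, 0.9), (False, 0.4), (True, 0.2), (False, 0.3)]
print(combine_subdetectors(votes))
```

A single confident subdetector can outvote several weak ones, which is the intended difference from a simple majority vote.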

3.2.4.4.1. Experimental Procedure

A number of experiments were run to establish if either extension outperformed the
other. For each experiment, we have plotted the probability of detection and false alarm
rate when there were between one and five scan lines on target. In addition to the two
extended detectors, the original classifier was also run to provide a baseline for
comparison. Five pixels per scan line were used in the experiments on the extended detectors.
The size of the data vector for the baseline experiment depended on the number of scan
lines used in the extended experiments. For example, if L scan lines were used in the
multi-line experiments, 5 × L pixels were used in the baseline experiment. Thus the size of the
final data vectors for the baseline and first extension were equal for all experiments.
Because the dimensionalities of the two data vectors are equal, it should be possible to
compare the results of the two techniques to see the effect of using multiple lines to detect
the target. Otherwise, it would not be possible to determine if any increase in performance
was due to additional information contained in the other scan lines or if it was due to the
increase in dimensionality of the data vectors.

Four sets of experiments were run using the three detection techniques. Two of the
four experiments were run using the simple post-processor described in Section 3.2.4.3.1.
This post-processor simply removes detections that are shorter than a specified threshold.
The other two experiments did not use the post-processor. These two types of experiments
were further subdivided based on how the probability of detection and false alarm
rate were defined.


The first method of defining the above statistics was the same as the method
presented for the single line case; that is, on a pixel by pixel basis. In this case, the
probability of detection can be defined as

$$P_{detect} = \frac{T_{target}}{T_{true}}$$

where $T_{target}$ is the number of target pixels classified correctly and $T_{true}$ is the total number
of target pixels present in the ground-truth image. Likewise, the false alarm rate can be
defined as

$$P_{false} = \frac{B_{target}}{B_{true}}$$

where $B_{target}$ is the number of background pixels classified as target pixels and $B_{true}$ is the
total number of background pixels present in the ground-truth image. We shall call this
method of defining the statistics the single-pixel method.
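
The single-pixel statistics can be computed directly from a detector output and a ground-truth mask. The sketch below assumes both are equal-sized 2-D lists of booleans (True meaning target) and that the ground truth contains at least one target pixel and one background pixel; the function name is illustrative.

```python
def single_pixel_stats(predicted, truth):
    """Return (P_detect, P_false) computed on a pixel-by-pixel basis."""
    t_target = t_true = b_target = b_true = 0
    for pred_row, truth_row in zip(predicted, truth):
        for p, t in zip(pred_row, truth_row):
            if t:
                t_true += 1
                t_target += bool(p)   # target pixel classified correctly
            else:
                b_true += 1
                b_target += bool(p)   # background pixel classified as target
    return t_target / t_true, b_target / b_true


truth = [[False, True, True, False], [False, True, True, False]]
predicted = [[False, True, False, False], [True, True, True, False]]
print(single_pixel_stats(predicted, truth))
```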

The second method of defining the statistics is based on calling any group of
connected target pixels a single detection. (A group of pixels on the same scan line are
connected if all of them are labeled as target pixels and each of them is adjacent to at least one
other pixel in the group.) If any pixel of a detection lies on the target then the target is
labeled as found and the detection is said to be correct. In this method, the probability of
detection can be defined as

$$P_{detect} = \frac{T_{target}}{T_{true}}$$

where $T_{target}$ is the number of target detections touching the target and $T_{true}$ is the total
number of target stripes present in the ground-truth image. Likewise, the false alarm rate
can be defined as

$$P_{false} = \frac{D_{background}}{D_{total}}$$

where $D_{background}$ is the number of detections that do not touch a target stripe, and $D_{total}$ is
the total number of detections found in the image. We shall call this method of defining
the statistics the multi-pixel method. Note that there is a fundamental difference in how
the false alarm rate is defined in the two methods. In the single-pixel method, the false
alarm rate is defined to be the percentage of misclassified background pixels. In the
multi-pixel method, the false alarm rate is defined to be the percentage of misclassified
target detections.
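
A minimal sketch of the multi-pixel statistics follows. It treats each scan line (row) independently, finds maximal runs of target-labeled pixels, and tests runs for overlap with the ground-truth stripes. It assumes that a target stripe touched by several detections is counted only once, and that the image contains at least one stripe and one detection; function names are illustrative.

```python
def find_runs(row):
    """Return (start, end) pairs, end exclusive, of maximal runs of truthy values."""
    runs, start = [], None
    for i, v in enumerate(row):
        if v and start is None:
            start = i
        elif not v and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(row)))
    return runs


def multi_pixel_stats(predicted, truth):
    """Return (P_detect, P_false) using detections (runs) instead of pixels."""
    stripes_total = stripes_found = 0
    detections_total = detections_background = 0
    for pred_row, truth_row in zip(predicted, truth):
        stripes = find_runs(truth_row)
        found = [False] * len(stripes)
        for d_start, d_end in find_runs(pred_row):
            detections_total += 1
            hit = False
            for k, (t_start, t_end) in enumerate(stripes):
                if d_start < t_end and t_start < d_end:  # runs overlap
                    found[k] = True
                    hit = True
            if not hit:
                detections_background += 1
        stripes_total += len(stripes)
        stripes_found += sum(found)
    return stripes_found / stripes_total, detections_background / detections_total


truth = [[0, 0, 1, 1, 1, 0, 0, 0], [0, 1, 1, 0, 0, 0, 0, 0]]
predicted = [[1, 0, 0, 1, 0, 0, 1, 1], [0, 0, 0, 0, 0, 1, 0, 0]]
print(multi_pixel_stats(predicted, truth))
```

Because the denominator of the false alarm rate is the number of detections rather than the number of background pixels, a handful of stray detections in a large background can push this rate very high.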

3.2.4.4.2. Experimental Results

Figures 3.2.15 and 3.2.16 show graphs from the experiment with no post-processing
using the single-pixel method of generating the statistics. These graphs are shown so that
the different techniques can be compared without any masking effects
of the post-processor being present. Note that $P_{detect}$ for all three techniques is constant at
about 70% independent of the number of scan lines used. However, the false alarm rate
varies significantly depending on the technique and the number of scan lines used. In this
case, the baseline technique gave the lowest false alarm rate followed by the first and
second extensions in that order. Please note that the statistics yielded by all three detection
techniques will be equivalent when only one scan line is used because both extensions
reduce to the single-scan-line detector in this degenerate case. Figures 3.2.17 and 3.2.18
are graphs of the statistics for the same experiment using the multi-pixel detection
criterion. Figure 3.2.17 demonstrates that increasing the number of scan lines used for
detection did not change the probability of detection for the two extended techniques but did
decrease the probability of detection for the baseline experiment. This decrease is thought
to be due to the decrease in the total number of detections that occurred in the baseline
case as the number of scan lines was increased. Note that Figure 3.2.18 shows that about
98% of all detections are false alarms. It also demonstrates that changing the number of
scan lines used had little effect on the false alarm rate in all three techniques.

Figures 3.2.19 - 3.2.22 show how detection is affected when a simple post-processor
is used to remove some false alarms. Figures 3.2.19 and 3.2.20 show the graphs of the
single-pixel statistics when the post-processor was used. These graphs show that the
post-processor reduced both $P_{detect}$ and $P_{false}$. They also show that almost no gains in

Figure 3.2.15 Probability of detection vs. number of scan lines using single-pixel method and
no postprocessing.

Figure 3.2.16 False alarm rate vs. number of scan lines using single-pixel method and no
postprocessing.

Figure 3.2.17 Probability of detection vs. number of scan lines using multi-pixel method and
no postprocessing.

Figure 3.2.19 Probability of detection vs. number of scan lines using single-pixel method and
with postprocessing.

Figure 3.2.20 False alarm rate vs. number of scan lines using single-pixel method and with
postprocessing.


[Plot: probability of detection vs. number of scan lines, multi-pixel method. Legend: baseline, first extension, second extension.]

[Plot: false alarm rate (with post-processing) vs. number of scan lines. Legend: baseline, first extension, second extension.]

Figure 3.2.22 False alarm rate vs. number of scan lines using multi-pixel method and with
postprocessing.

performance are realized by increasing the number of scan lines when the post-processor
was used. Finally, Figures 3.2.21 and 3.2.22 show the same experiment (with post-processor)
using the multi-pixel statistic generation technique. Both $P_{detect}$ and $P_{false}$ were
reduced in this experiment also. However, the number of scan lines did affect the false
alarm rate to different degrees for the various techniques. In this experiment, the baseline
experiment benefited the most from increasing the number of scan lines used. The second
extension also improved slightly with an increasing number of scan lines. The number of
scan lines used had no effect on the performance of the first extended detector.

3.2.4.4.3. Discussion and Future Work

The  detection  results  from  both  techniques  were  disappointing.  Neither  technique 
showed  any  significant  improvement  when  the  number  of  scan  lines  used  was  increased. 
Furthermore,  if  either  technique  did  show  some  improvement  with  an  increasing  number 
of  scan  lines,  the  baseline  experiment  usually  showed  greater  improvement.  It  is  hoped 
that  some  other  techniques  can  be  used  to  extend  the  single  line  detector  to  show  improved 
performance  with  multiple  scan  lines. 

3.2.4.5. Target Detection Using Multiple Lines of LADAR Data - Approach 2

The single-line detector described here can be extended to detect targets using multiple
lines of LADAR data if the value for each pixel is replaced by its log-likelihood ratio.
The extension is achieved by combining the log-likelihood ratios for multiple lines
produced by the single-line detector into a single value for each pixel. One method for
combining the log-likelihood values from the single-line detectors, based on a weighted-voting
scheme, was first described in [KaYo87]; an extension of that method, based on
meta-classifiers, is used to combine the single-line values in this report. This combination
technique is statistically based and mirrors the classification process performed by the
single-line detector.

To combine the log-likelihood values from L scan lines, the log-likelihood values for
the pixels undergoing combination are first grouped into a single L-dimensional data
vector. The data vectors are built by sampling the values in the log-likelihood image
perpendicular to the scanning direction (the values are sampled perpendicularly so that they
all come from different scan lines). As in the single-line case, the data vectors are
assumed to be samples from one of two probability density functions, one for the target
and one for the background. We can represent the log-likelihood density function for the
target as

$$f'_{target}(\ell_1, \ell_2, \ldots, \ell_L)$$

and the background log-likelihood density function as

$$f'_{background}(\ell_1, \ell_2, \ldots, \ell_L).$$

Using the process described in the previous section, we can compute

$$f'_{target}(\ell_1, \ell_2, \ldots, \ell_L)$$

and

$$f'_{background}(\ell_1, \ell_2, \ldots, \ell_L)$$

for each sample. The pixel can then be said to belong to an object if

$$f'_{target}(\ell_1, \ell_2, \ldots, \ell_L) > C \cdot f'_{background}(\ell_1, \ell_2, \ldots, \ell_L)$$

where, once again, C is equivalent to a threshold. If the density functions are assumed to
be Gaussian then the pixel can be classified as belonging to part of a target if

$$(X - M'_b)^T {\Sigma'_b}^{-1} (X - M'_b) - (X - M'_t)^T {\Sigma'_t}^{-1} (X - M'_t) + \ln\frac{|\Sigma'_b|}{|\Sigma'_t|} > 2\ln(C) \qquad (3.2)$$

where $X = (\ell_1, \ell_2, \ldots, \ell_L)^T$ is the data vector and $M'_t$, $M'_b$ and $\Sigma'_t$, $\Sigma'_b$ are the mean
vectors and covariance matrices for the log-likelihood target and background densities,
respectively. Note that inequality (3.2) is the functional expression of a quadratic
classifier. Once again, if the pixel values are set to the value produced by the left hand
side of the inequality (3.2), then the optimal value for C can be found by thresholding
the image so formed.
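
The Gaussian combination step can be sketched in a few lines. The sketch below assumes the convention that a pixel is labeled target when the target density exceeds C times the background density, so the score it computes equals twice the log of the density ratio; that sign convention, the function names, and the example means and covariances are assumptions for illustration, not values from the experiments. A small Gaussian elimination routine is included so the block needs only the standard library.

```python
import math


def solve_and_logdet(cov, v):
    """Solve cov * x = v by Gaussian elimination with partial pivoting.
    Also returns ln|det(cov)|."""
    n = len(v)
    a = [list(cov[i]) + [v[i]] for i in range(n)]
    logdet = 0.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        logdet += math.log(abs(a[c][c]))
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n + 1):
                a[r][k] -= f * a[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = a[r][n] - sum(a[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / a[r][r]
    return x, logdet


def mahalanobis_and_logdet(x, mean, cov):
    """Return ((x-mean)^T cov^-1 (x-mean), ln|cov|)."""
    d = [xi - mi for xi, mi in zip(x, mean)]
    y, logdet = solve_and_logdet(cov, d)
    return sum(di * yi for di, yi in zip(d, y)), logdet


def quadratic_score(x, mean_t, cov_t, mean_b, cov_b):
    """2 * ln(f_target(x) / f_background(x)) for Gaussian densities."""
    d_t, ld_t = mahalanobis_and_logdet(x, mean_t, cov_t)
    d_b, ld_b = mahalanobis_and_logdet(x, mean_b, cov_b)
    return d_b - d_t + ld_b - ld_t


def is_target(x, mean_t, cov_t, mean_b, cov_b, c=1.0):
    """Assumed rule: target when the score exceeds 2 ln(C)."""
    return quadratic_score(x, mean_t, cov_t, mean_b, cov_b) > 2.0 * math.log(c)


# Hypothetical two-line example (L = 2).
mean_t, cov_t = [-2.0, -2.0], [[1.0, 0.2], [0.2, 1.0]]
mean_b, cov_b = [2.0, 2.0], [[4.0, 0.0], [0.0, 4.0]]
print(is_target([-2.0, -2.0], mean_t, cov_t, mean_b, cov_b))
```

Writing the score image first and thresholding it afterwards keeps the choice of C separate from the density estimation, which is how the text proposes to find the threshold.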

It is informative to compare the multi-line detection scheme described in this report
with the techniques for multi-line detection discussed in [KaYo87]. The first technique
presented in that report did not break the multi-line detection process into the two-step
process used here. Instead, it increased the dimensionality of the sample vector to include
more data from the multiple scan lines. Using this scheme, if the dimension of the data
vector for each line is M and L scan lines are used in the detection process, then the
dimensionality of the multi-line data vector would be Dim = M × L. Obviously, this process can
become computationally infeasible very quickly as the single-line dimensionality or the
number of scan lines increases.

The second extension to multi-line detection presented in that report used a two-part
process similar to the one described here. However, instead of using a quadratic classifier
to combine the output of the scan lines, as the current method does, the previous scheme
added the log-likelihood values for the scan lines being combined. If the sum of the values
was less than zero, the detector would indicate the presence of a target. This combination
process was viewed as a weighted-voting scheme where the weights of the votes were
proportional to the decision's confidence. To relate this process to the current combination
scheme, it should be noted that the summation process can be viewed as classifying the
feature vectors (composed of the log-likelihood values from the single-line detector) with a
linear classifier. It is possible to view the process as a linear classification because the
summation can be expressed as $S_{total} = W^T X$, where $W^T$ is an M-dimensional row vector
whose elements are all 1's and $X$ is the data vector. Thus, if $S_{total}$ was negative, the
detector would assert the presence of a target; otherwise, it would deny a target's presence.
Since the linear classifier can be viewed as a special case of the quadratic classifier (a
quadratic classifier degenerates to a linear classifier if the covariance matrices of all classes
are equal), the current scheme using a quadratic classifier can be viewed as an extension to
the previous voting scheme. Note that the current scheme can be viewed as a further
generalization of the previous scheme because the previous scheme used a fixed threshold of
zero whereas the current scheme allows an optimal threshold to be set.
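
Both observations can be checked numerically with a short sketch. The first function writes the summation vote as a dot product with an all-ones weight vector. The other two show, under the simplifying assumption of a shared diagonal covariance (an assumption made here only to keep the arithmetic short), that the difference of the two quadratic terms collapses to an expression that is linear in the data vector. All names and numbers are illustrative.

```python
def sum_vote(values, threshold=0.0):
    """Previous scheme: S_total = W^T X with W all ones; target if below threshold."""
    weights = [1.0] * len(values)
    s_total = sum(w * x for w, x in zip(weights, values))
    return s_total < threshold, s_total


def quadratic_difference(x, mean_t, mean_b, variances):
    """d_b - d_t for a shared diagonal covariance (log-determinant terms cancel)."""
    d_b = sum((xi - b) ** 2 / v for xi, b, v in zip(x, mean_b, variances))
    d_t = sum((xi - t) ** 2 / v for xi, t, v in zip(x, mean_t, variances))
    return d_b - d_t


def linear_equivalent(x, mean_t, mean_b, variances):
    """The same quantity written as w . x + w0, which is linear in x."""
    w = [2.0 * (t - b) / v for t, b, v in zip(mean_t, mean_b, variances)]
    w0 = sum((b * b - t * t) / v for t, b, v in zip(mean_t, mean_b, variances))
    return sum(wi * xi for wi, xi in zip(w, x)) + w0


x = [0.5, -1.0, 2.0]
mean_t, mean_b, variances = [-1.0, -1.0, -1.0], [1.0, 2.0, 0.5], [1.0, 2.0, 0.5]
print(quadratic_difference(x, mean_t, mean_b, variances))
print(linear_equivalent(x, mean_t, mean_b, variances))
print(sum_vote([-1.5, 0.5, -0.25]))
```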

Since we are comparing the current experiment with those performed in the past, the
differences between the data in the new experiment and the data in the old experiments
should be explored. The data obtained for the past experiments was very preliminary and
had some unusual properties that could be exploited by target detectors and segmenters.
For example, valid data was produced by the sensor for only a few ambiguity intervals. If
the sensor did not detect any object in this valid range, it would output high variance noise.
Thus, the technicians at the data collection site had to tune the sensor for every target
scanned to guarantee that the target would fall within the valid range interval. Although it
would seem that the sensor providing random measurements for some pixels would hinder
target detection and segmentation, it actually made both processes much easier (the
technicians tuning the system can be thought of as a preprocessing step with perfect knowledge).
Past segmenters and detectors were able to take advantage of the variance of out-of-band
signals by labeling any group of pixels with low variance as a target and all other pixels as
background. Since the current data has valid AM data for all pixels, this technique is not
available with the current data set.

3.2.4.6. Experimental Procedure

Three sets of experiments were run to determine the performance of the detector
when the number of scan lines used in the detection process was varied. A different
method of building the data vector was used in each experimental set. The performance of
the detector was determined in each experimental set when 1, 2, 3, 4, 5, 10, 15, 20 and 25
scan lines were used in the detection process. Furthermore, the performance of the
detector was determined when both horizontal and vertical raster-scanning was used to scan the
image.

In the first experimental set, no preprocessing was applied to the image before the
data vectors were built. In this experiment, 51-dimensional data vectors were used for the
single-line detector when horizontal scanning was used; 38-dimensional data vectors were
used for vertical scanning. However, when horizontal scanning was used, only 50 (37 for
vertical scanning) of the data vector's values were taken from the LADAR resolved-range
image; the final value for the data vectors was set equal to the LADAR's return amplitude.
The experiment in which no preprocessing was applied to the image will be referred to as
the raw-data experiment. Some preprocessing was applied to the data in the second
experimental set. The pixel values in this set were set equal to the resolved-range value modulo
1875. This process was meant to recover, to the greatest extent possible, the raw AM
LADAR data. Besides this preprocessing, this experiment was equivalent to the first. That
is, the data vectors were 51 (38) dimensional with 50 (37) values representing the
resolved range and one value representing the return amplitude. This experiment will be
referred to as the pseudo-AM experiment. The data vectors used in the final experiment
were built differently than those used in the first two experiments. This experiment was
designed to mimic the variance-based segmenters that have been applied to previous
LADAR data sets. To mimic these segmenters, the pixel values were set to the
resolved range modulo 1875 as in the pseudo-AM experiment. However, the data vectors were
5-dimensional in this experiment and were built from the values of the pixel and its two
immediate neighbors on each side. It was hoped that the pseudo-AM values for the
on-target pixels would have a lower variance than those off-target; the detector should then be
able to distinguish between the on-target and off-target pixels via their variances. This
experiment will be referred to as the variance experiment.
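
The three ways of building data vectors can be sketched as follows. The sketch assumes integer resolved-range values, skips pixels too close to the end of a scan line in the variance experiment, and does not show how the 50 (or 37) range samples are selected from a scan line, since that is not specified here; all function names are illustrative.

```python
AMBIGUITY_MODULUS = 1875  # modulus quoted in the text


def pseudo_am(range_line):
    """Fold resolved-range values back into one interval (pseudo-AM data)."""
    return [r % AMBIGUITY_MODULUS for r in range_line]


def line_vector(range_samples, amplitude):
    """Raw-data or pseudo-AM vector: range samples followed by one
    return-amplitude value (e.g. 50 + 1 = 51 dimensions)."""
    return list(range_samples) + [amplitude]


def variance_vectors(range_line, half_width=2):
    """Variance experiment: 5-dimensional vectors made of a pseudo-AM pixel
    and its two immediate neighbors on each side."""
    values = pseudo_am(range_line)
    return [
        values[i - half_width: i + half_width + 1]
        for i in range(half_width, len(values) - half_width)
    ]


print(pseudo_am([100, 1875, 2000, 3751]))
print(len(line_vector(range(50), amplitude=7)))
print(variance_vectors([1, 2, 3, 4, 5, 6]))
```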

In  both  the  single-line  and  multi-line  detection  procedures,  a  simple  post-processing 
scheme  was  applied  to  the  image  after  the  pixels  were  classified  by  the  detector.  This 
post-processing  scheme  deleted  all  detections  that  were  only  a  single  pixel  long  in  order  to 
lower  the  detector’s  false  alarm  rate. 
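
This post-processing step amounts to deleting short runs of target labels on each scan line. A minimal sketch, with the minimum run length exposed as a parameter (2 removes single-pixel detections as described here; the function name is an assumption):

```python
def remove_short_detections(row, min_length=2):
    """Clear every run of target labels shorter than min_length pixels."""
    out = [bool(v) for v in row]
    i, n = 0, len(row)
    while i < n:
        if row[i]:
            j = i
            while j < n and row[j]:
                j += 1              # j is one past the end of the run
            if j - i < min_length:
                for k in range(i, j):
                    out[k] = False
            i = j
        else:
            i += 1
    return out


labels = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
print(remove_short_detections(labels))
```

Raising `min_length` trades false alarms for missed detections on scan lines that cross only a thin part of a target.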

In the following experiments, any connected group of pixels classified as belonging
to a target is considered to be a single detection. (A group of pixels on the same scan line
are connected if all of them are labeled as target pixels and each of them is adjacent to at
least one other pixel in the group.) If any pixel of a detection touches the target then the
target is labeled as found and the detection is said to be correct. If more than one detection
touches a single target stripe, then the target stripe is only counted as being detected once.
Since the detections are defined in this manner, the probability of detection can be defined
as

$$P_{detect} = \frac{T_{target}}{T_{true}}$$

where $T_{target}$ is the number of target stripes touched by at least one detection and $T_{true}$ is
the total number of target stripes present in the ground-truth image. Likewise, the false
alarm rate can be defined as

$$P_{false} = \frac{D_{background}}{D_{total}}$$

where $D_{background}$ is the number of detections that do not touch a target stripe, and $D_{total}$ is
the total number of detections found in the image. Note that this defines the false alarm
rate to be the percentage of target detections that are incorrect.

The data used to determine the detector's performance consisted of the four typical
LADAR images described in Section 3.1.2; these images are shown in Figure 3.2.23. The
targets in the images were hand-segmented to provide ground truth values for the detector
training process. The borders of the targets are shown in Figure 3.2.23 as white outlines.
Note that the results of the experiments presented here are better than could be expected in
the real world because the detector was trained and run on the same data set.

3.2.4.7.  Experimental  Results 

Figures 3.2.24, 3.2.25 and 3.2.26 show the log-likelihood images produced by the
single-line detector for the raw, the pseudo-AM and the variance experiments, respectively.
Pixels in these images with negative log-likelihood values (shown as the light pixels
here) are pixels that the detector hypothesized as belonging to part of a target. As indicated
by these images, the single-line detector performed very poorly on the LADAR data.
Figures 3.2.27, 3.2.28 and 3.2.29 show the detector statistics for the three experiments;
they show the probability of detection and false alarm rates for a range of thresholds. The
probability of detection is indicated in these figures with a solid line and the false alarm
rate is indicated with a dashed line. Although the three detectors performed poorly, we
believe that the statistics reported are biased by the large amount of background area.
Since there are about two orders of magnitude more background pixels than target pixels, a
small percentage of background pixels misclassified as belonging to a target will result in a
large false alarm rate. Note that in each of these experiments, the false alarm rate when
vertical scanning was used was slightly lower than when horizontal scanning was used.
We believe that this effect is due to the nonhomogeneity of the horizontal scan lines, which
made it difficult to model the samples with a single density function. When horizontal
scanning was used, the properties of the scan lines differed significantly
with their placement on the images. For example, scan lines near the top of the images
consisted almost entirely of "sky" pixels; these pixels typically had very small
return-amplitude values and took random range values. Scan lines from the center of the image
on down usually produced progressively smaller range values as the position of the scan
line moved down the image. Conversely, when vertical scan lines were used to scan the
image, all scan lines exhibited the same properties regardless of their position in the
image. As the scan moved down the scan line from the top, most scan lines started with
"sky" pixels and progressed to valid range pixels. Furthermore, for multi-line detection,
since targets usually are wider than they are tall, vertical scan lines can be spaced farther
apart than horizontal scan lines and still provide the same number of scan lines on target.
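
The bias described above can be made concrete with a little arithmetic. The sketch below assumes, as the argument implies but the text does not state explicitly, that the false alarm rate is measured as the fraction of pixels declared "target" that are actually background. The function name and the pixel counts are hypothetical and chosen only for illustration.

```python
def detection_counts(n_target, n_background, p_detect, p_background_error):
    """Return (true detections, false alarms, false-alarm fraction of all detections).

    Assumption: the false alarm rate is the share of declared-target pixels
    that really belong to the background.
    """
    true_hits = n_target * p_detect
    false_hits = n_background * p_background_error
    declared = true_hits + false_hits
    fraction = false_hits / declared if declared > 0 else 0.0
    return true_hits, false_hits, fraction


# Hypothetical scene with a 100:1 background-to-target pixel ratio.
hits, false_alarms, fa_fraction = detection_counts(1000, 100000, 0.9, 0.02)
```

With these made-up numbers only 2% of the background is misclassified, yet more than two thirds of the declared target pixels are false alarms.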

Figures 3.2.30 - 3.2.35 show the probability of detection and false alarm rates for the
three experimental sets when multiple scan lines were used to detect the targets. These
figures show the detection statistics for a range of thresholds. (The statistics shown in
these figures for large thresholds (> 200) are meaningless. If the detection with the
strongest belief was correct then these values indicate that there is a 0% false alarm rate
for the large thresholds; otherwise, the values indicate that there is a 100% false alarm rate
for the large thresholds. In either case, the correctness of the detection with the largest
belief is insignificant.) As can be seen from the graphs, only a very subtle improvement was
realized when an increased number of scan lines was used to detect the targets. For some
of the experiments, it was easier to select a valid detection threshold when a large number
of scan lines was used to detect the targets. When only a small number of scan lines was
used to detect the targets, there was only a small number of thresholds where the detector
did not classify all pixels as belonging to part of the target or classify all pixels as belonging
to part of the background. The number of valid detection thresholds in the transition
region between these two cases grew as the number of scan lines increased. Figures
3.2.36, 3.2.37, and 3.2.38 show the log-likelihood images for the three experiments when
25 lines were used to scan the image.

Figure 3.2.23 This figure shows the unprocessed data used in the three experiments. The
outlines of the hand-segmented targets used to train the detector are shown in white.

Figure 3.2.24 This figure shows the log-likelihood images produced by the detector when
presented with raw data. The top four images are the output when vertical scanning
is used and the bottom four show the output when horizontal scanning is used.

Figure 3.2.25 This figure shows the log-likelihood images produced by the detector when
presented with pseudo-AM data. The top four images are the output when vertical
scanning is used and the bottom four show the output when horizontal scanning is used.

Figure 3.2.26 This figure shows the log-likelihood images produced by the detector when
presented with variance data. The top four images are the output when vertical
scanning is used and the bottom four show the output when horizontal scanning is used.

Figure 3.2.27 This figure shows the probability of detection (solid line) and false alarm rates
(dashed line) for the raw-data experiment using a single scan line. The top and
bottom graphs show the statistics for vertical and horizontal scanning, respectively.

Figure 3.2.28 This figure shows the probability of detection (solid line) and false alarm rates
(dashed line) for the pseudo-AM experiment using a single scan line. The top
and bottom graphs show the statistics for vertical and horizontal scanning, respectively.

Figure 3.2.29 This figure shows the probability of detection (solid line) and false alarm rates
(dashed line) for the variance experiment using a single scan line. The top and
bottom graphs show the statistics for vertical and horizontal scanning, respectively.

Figure 3.2.31 This figure shows the probability of detection (top graph) and false alarm rates
(bottom graph) for the raw-data experiment using vertical scanning. Both
graphs show the statistics for a number of scan lines and a range of thresholds.

Figure 3.2.32 This figure shows the probability of detection (top graph) and false alarm rates
(bottom graph) for the pseudo-AM experiment using horizontal scanning. Both
graphs show the statistics for a number of scan lines and a range of thresholds.

Figure 3.2.34 This figure shows the probability of detection (top graph) and false alarm rates
(bottom graph) for the variance experiment using horizontal scanning. Both
graphs show the statistics for a number of scan lines and a range of thresholds.

Figure 3.2.35 This figure shows the probability of detection (top graph) and false alarm rates
(bottom graph) for the variance experiment using vertical scanning. Both graphs
show the statistics for a number of scan lines and a range of thresholds.

3.2.4.8.  Discussion 

The results produced by the statistical detector for all three experiments are disappointing.
The false alarm rate in all three experiments, due to the large area of the background,
was large enough to render the detector unusable. We are also disappointed with the inability
of the multi-line detector to improve upon the results of the single-line detector. We
believe that the output of the single-line detector was so poor that there was not enough
information in the log-likelihood images that it produced to enable the multi-line detector
to improve the performance. Because of the poor performance of the detector, we do not
believe that there is much promise for the development of a reliable detector based on the
simple statistical scheme presented here; thus, we believe that resources would be better
spent investigating other avenues for detecting tactical targets with a limited amount of
LADAR data.

3.3. LOW LEVEL PROCESSING OF LADAR DATA

This  section  describes  our  progress  in  the  low  level  processing  of  LADAR  data  which 
will  feed  the  geometric  methods  in  Section  3.4.  As  proposed  in  [KaYo88],  we  have 
moved beyond our initial low level edge detection and component labeling scheme for surface
extraction in favor of a more noise immune region growing approach. We have also
implemented  two  optional  preprocessing  steps  to  help  diminish  the  effects  of  the  noise  so 
prevalent  in  current  LADAR  imagery.  Processing  results  illustrating  the  performance  of 
our new algorithms are presented for both synthetic and actual LADAR data. The algorithms
are also evaluated more rigorously via a newly defined feature set for surfaces and
several  performance  measures. 


Figure 3.2.36 This figure shows the raw-data log-likelihood images produced by the multi-line
detector when 25 lines were used to scan the images. The top four images are
the output when vertical scanning is used and the bottom four show the output
when horizontal scanning is used.

Figure 3.2.37 This figure shows the pseudo-AM log-likelihood images produced by the multi-line
detector when 25 lines were used to scan the images. The top four images
are the output when vertical scanning is used and the bottom four show the output
when horizontal scanning is used.

Figure 3.2.38 This figure shows the variance log-likelihood images produced by the multi-line
detector when 25 lines were used to scan the images. The top four images are
the output when vertical scanning is used and the bottom four show the output
when horizontal scanning is used.

3.3.1. Low Level LADAR Range Data Processing

We are developing the special purpose low level software necessary to process
LADAR imagery. Thus far we have implemented the surface segmentation scheme
described in Section 3.4.2.1.1. This section addresses some of the difficulties encountered
when processing LADAR data and presents the algorithms and techniques we have
designed to overcome them.

Determination of Surface Normals

Aside from the (x,y,z) locations of pixels, surface normals are the most important
information we can get out of a range image. It is important to do the best job possible
when determining surface normals because most subsequent processing steps use them.
Since they are based on first order derivatives of the range map, however, their computation
is susceptible to the effects of quantization error, dropouts, sparseness of data, and
noise.

Figure 3.3.1 is a range image of an M60A1 tank which was generated using our Electronic
Terrain Board Model (see Section 4). The resolution of the range image is comparable
to a LADAR image taken at 500 meters, in the sense that the sampling in the x and y
directions is similar to that of LADAR at that range. However, our simulated tank image
has greater resolution in the z direction (the full dynamic range of gray scale values is used
on target). The coordinate system convention we have adopted for range images is that the
image sits in the x,y-plane, with the origin in the upper-left-hand corner, the x-axis pointing
from left to right, and the y-axis pointing from top to bottom. The positive z-axis must
be the cross product of the x- and y-axes, and so points into the image.

We computed surface normals for the tank image by fitting symmetric planar patches
to the data as described in Section 3.3.3.1. The (x,y,z) components of a surface normal are
easily expressed in terms of the coefficients of the planar patch fit equation z = ax + by + c
as: (Sx,Sy,Sz) = (a,b,-1). Figures 3.3.2 and 3.3.3 illustrate the (x,y,z) components of the
surface normals computed using 3 by 3 and 5 by 5 windows, respectively. The darker the
pixel in the images, the larger the particular component of its surface normal. The wavy
noise in the component images is due to the quantization error of PADL-generated 8-bit
integer range images. While the effects of having only 256 available gray values in a
range image are not apparent to the naked eye, moderately sized window operators can
detect the step from one quantization level to the next as one moves along an oblique surface.
Using windows that are too small can cause these steps to appear as false edges, as
demonstrated by the 3 by 3 windows of Figure 3.3.2. Using larger windows reduces the
effect, but can make actual edges appear blurry, as seen in Figure 3.3.3. If windows that
are too large are used, the edges may become transitional surfaces between the actual surfaces
they separate.
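
The planar patch fit can be sketched in a few lines. The version below is an illustration only, assuming unit pixel spacing in x and y (a real range map would supply metric x and y positions) and a window that lies fully inside the image; the function names are hypothetical. Because the window is symmetric about its center, the sums of x, y and xy vanish and the least-squares normal equations decouple.

```python
def fit_planar_patch(z, row, col, half):
    """Least-squares fit of z = a*x + b*y + c over a (2*half+1) square window.

    x and y are pixel offsets from the window center (assumed unit spacing).
    Returns (a, b, c, mean square fit error).
    """
    offsets = range(-half, half + 1)
    width = 2 * half + 1
    n = width * width
    # Sum of dx^2 over the whole window; by symmetry it equals the sum of dy^2.
    sxx = sum(d * d for d in offsets) * width
    sxz = syz = sz = 0.0
    for dy in offsets:
        for dx in offsets:
            v = z[row + dy][col + dx]
            sxz += dx * v
            syz += dy * v
            sz += v
    a, b, c = sxz / sxx, syz / sxx, sz / n
    mse = sum((z[row + dy][col + dx] - (a * dx + b * dy + c)) ** 2
              for dy in offsets for dx in offsets) / n
    return a, b, c, mse


def surface_normal(a, b):
    """Surface normal (Sx, Sy, Sz) = (a, b, -1) of the fitted patch."""
    return (a, b, -1.0)
```

The returned mean square error is the quantity that a later reassignment pass can compare between neighboring patches.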

Figure 3.3.2 Surface normal components using 3 by 3 windows. (c) Z Component.

We have devised a way to remove the effects of quantization error and some other
noise without producing fat edges or false surfaces. The surface normals are determined in
a two step process [KaCh87, ChKa88]. The first step fits symmetric planar patches as
before, and computes the mean square fit error for each patch. The window size used is
large enough to remove the effects of noise. We found that 9 by 9 windows work well for
targets at 500 meter resolution. During the second step another pass is made over the
image, examining the fit error of pixels within the processing window used to compute the
surface normal of the current pixel. These pixels contributed to the computation of the
current pixel's planar patch, and conversely, the current pixel contributed to the computation
of theirs. We may therefore conclude that if the fitted planar patch for one of these
neighboring pixels has a smaller fit error e_n than that of the current pixel, e_c, it is reasonable
to consider the current pixel to be part of that better fitting patch (since it contributed to it),
and to assign the same surface normal to the current pixel. Since pixels in the same processing
window could actually be any distance away from the current pixel in 3-D space,
we normalize their fit error by this distance when we make the comparison:

$e_{cn} < e_c$,

where

$e_{cn} = e_n \left( \sqrt{(x_c - x_n)^2 + (y_c - y_n)^2 + (z_c - z_n)^2} + 1 \right)$

is the neighbor's error normalized by its distance. This prevents pixels from being
assigned to surfaces which are a substantial distance away from them, and makes the
reassignment process local.

Figure 3.3.4 is a 2-D example illustrating the two step process. Figure 3.3.4(a) shows
a range map of four surfaces separated by the three types of edges we are
interested in detecting (see Section 3.4.2.1.1 and Section 3.3.1.2). The bars below the
x-axis indicate the consecutive positions of a 5 pixel wide processing window as it moves
across the data. At each window position a minimal mean square error linear fit is computed
for the sample points, corresponding to fitting a planar patch in the 3-D case. The
vector in the x,z-plane orthogonal to the fitted line (corresponding to a 3-D surface normal)
is associated with the pixel in the center of the window, as is the fit error e_c. Figure
3.3.4(b) illustrates the "normals" for these fitted "patches." Figure 3.3.4(c) shows the "surface
normal" assignments made during the second pass over the data by examining the fit
error e_cn of pixels within the processing window used to calculate the current "normal" and
fit error e_c. Remember that e_cn is a neighbor's fit error normalized by its distance away
from the current pixel (the one in the center of the window). Note that the "normals" of
Figure 3.3.4(c) are much better than those of (b), and will produce better results when used
in later processing.
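
The two step process can be sketched for a 1-D range profile, in the spirit of the 2-D example above. This is a minimal illustration under stated assumptions, not the implementation of [KaCh87, ChKa88]: samples are one pixel apart, the slope of the fitted line stands in for the "normal" (the vector orthogonal to the line is (slope, -1)), the neighbor's error is scaled by (distance + 1), and border samples without a full window are left unassigned. All function names are hypothetical.

```python
import math


def fit_line(xs, zs):
    """Least-squares line through (xs, zs); returns (slope, mean square error)."""
    n = len(xs)
    mx, mz = sum(xs) / n, sum(zs) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (v - mz) for x, v in zip(xs, zs)) / sxx
    mse = sum((v - (mz + slope * (x - mx))) ** 2 for x, v in zip(xs, zs)) / n
    return slope, mse


def two_pass_slopes(z, half=2):
    """Return (first-pass slopes, slopes after reassignment) for profile z."""
    n = len(z)
    centers = range(half, n - half)
    slopes, errors = [None] * n, [None] * n
    # Pass 1: fit a line in the window centered on each sample.
    for c in centers:
        xs = list(range(c - half, c + half + 1))
        slopes[c], errors[c] = fit_line(xs, z[c - half:c + half + 1])
    # Pass 2: adopt the fit of the neighbor with the smallest normalized error.
    assigned = list(slopes)
    for c in centers:
        best = errors[c]
        for k in range(max(half, c - half), min(n - half, c + half + 1)):
            dist = math.hypot(c - k, z[c] - z[k])
            e_cn = errors[k] * (dist + 1.0)
            if e_cn < best:
                best, assigned[c] = e_cn, slopes[k]
    return slopes, assigned


# Hypothetical profile: a flat surface meeting a ramp of slope 1 at sample 9.
profile = [0.0] * 10 + [float(i - 9) for i in range(10, 20)]
raw, reassigned = two_pass_slopes(profile)
```

In this example the first pass blurs the slope near the crease (0.2 at sample 8 and 0.8 at sample 10), while the second pass restores slopes of 0 and 1 on either side of it.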

The above procedure has a two-fold effect. First, it performs a kind of smoothing,
producing more uniform surfaces and eliminating noise. Second, it automatically does
edge thinning, yielding surfaces with maximal area. Figure 3.3.5 shows the surface normal
components computed for the range image of Figure 3.3.1 using a 9 by 9 window. Now
that we have the best surface normals possible without extensive preprocessing, our later
processing should also generate better results.

Figure 3.3.4 Computation of surface normals. (a) Range Map and 5 Pixel Wide Windows.
(b) Normals For Fitted Patches. (c) Assigned Normals After Examining Fit Error.

Figure 3.3.5 (a)-(c) Surface normal components using 9 by 9 windows and reassignment to
patches with minimal error. (d) Range pixels modified to improve curvature calculation.

3.3.1.1.  Curvature  Computation 

If we were to estimate curvature by merely convolving the range map with the window
operators described in [BeJa85, BeJa86, YaKa86a], we would not be taking advantage
of the noise removal performed by our surface normal determination technique. Since
curvature computation requires second order derivatives of the range map, the estimate is
already very sensitive to noise. In order to get better results, a modified range map is
created as surface normal assignment is performed during the second step of the aforementioned
technique. If the e_cn of a neighboring pixel is less than e_c, then as the neighboring
surface normal is assigned to the current pixel, the current pixel is also moved in the
modified range map to its projected position on the better fitting planar patch. This
modified range map is then convolved with curvature window operators which are the
same size as the surface normal computation windows.
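
For reference, the Gaussian curvature K and mean curvature H of a range surface z = f(x, y) follow from its first and second partial derivatives by the standard differential-geometry formulas. The sketch below estimates the derivatives with simple 3 by 3 central differences; this is a stand-in chosen for brevity, not the window operators of [BeJa85, BeJa86, YaKa86a], and the function names and the grid spacing parameter h are assumptions.

```python
def curvatures(zx, zy, zxx, zxy, zyy):
    """Gaussian (K) and mean (H) curvature of z = f(x, y) from its derivatives."""
    g = 1.0 + zx * zx + zy * zy
    K = (zxx * zyy - zxy * zxy) / (g * g)
    H = ((1.0 + zy * zy) * zxx - 2.0 * zx * zy * zxy
         + (1.0 + zx * zx) * zyy) / (2.0 * g ** 1.5)
    return K, H


def curvatures_at(z, row, col, h=1.0):
    """Central-difference estimate at an interior pixel of range map z[row][col]."""
    zx = (z[row][col + 1] - z[row][col - 1]) / (2.0 * h)
    zy = (z[row + 1][col] - z[row - 1][col]) / (2.0 * h)
    zxx = (z[row][col + 1] - 2.0 * z[row][col] + z[row][col - 1]) / (h * h)
    zyy = (z[row + 1][col] - 2.0 * z[row][col] + z[row - 1][col]) / (h * h)
    zxy = (z[row + 1][col + 1] - z[row + 1][col - 1]
           - z[row - 1][col + 1] + z[row - 1][col - 1]) / (4.0 * h * h)
    return curvatures(zx, zy, zxx, zxy, zyy)
```

Which sign of H is displayed as convex or concave depends on the orientation chosen for the z-axis, so the sign convention should be checked against the coordinate system in use.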

Figure  3.3.5(d)  shows  the  range  pixels  which  were  moved  in  the  modified  range  map 
of the tank image. White pixels were not moved at all, gray pixels were moved a negligible
amount, and pixels in the gray to black range (like those near the edges of the gun barrel)
are darker if they were moved farther. The images of Figure 3.3.6 show the signs of
the  Gaussian  and  mean  curvatures  obtained  by  convolving  the  modified  range  map  with  9 
by  9  window  operators.  Gray  indicates  roughly  zero  curvature  (planar),  white  is  positive 
curvature  (convex),  and  black  is  negative  curvature  (concave).  Note  that  even  with  the 
steps  taken  to  provide  a  better  range  map,  curvature  is  still  fairly  noisy  even  for  synthetic 
images  due  to  the  fact  that  it  is  a  second  order  operation. 

3.3.1.2.  Edge  Detection  and  Surface  Labeling 

The next task is to find the edges between surfaces, and then label the individual surfaces.
The types of edges to be found include jump edges, curvature maxima edges, and
surface  normal  disparity  edges,  as  described  below.  The  edge  labels  retain  information 
about  the  pixel’s  specific  edge  type,  which  is  useful  during  high  level  processing  of  the 
scene. 

The most obvious edges occur where the 3-D distance from a point to one of its neighbors
is greater than some threshold. A pixel meeting this criterion is labeled as a jump edge.
To ensure that only the pixels at the front and back of an oblique surface are labeled as
edges, rather than every pixel on it, the range discontinuity to the neighbor on the opposite side
of the pixel in question must be significantly smaller. Figure 3.3.7(a) shows jump edges
detected for the tank image. Both coarse and fine thresholds are used. A coarse jump edge
would separate the top of a target from background, while a finer threshold allows detection
of jumps resulting from one surface of the object occluding another, such as the gun
barrel or hatch occluding a portion of the turret.
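
The jump edge test, including the check against the opposite neighbor, can be sketched for a 1-D range profile. This is an illustration under assumptions: range differences between adjacent samples stand in for full 3-D distances, "significantly smaller" is interpreted as smaller than a fixed fraction (the hypothetical `ratio` parameter) of the larger discontinuity, and the two end samples are never labeled.

```python
def jump_edges(z, threshold, ratio=0.5):
    """Flag samples that border a range jump in a 1-D profile.

    A sample is an edge when the discontinuity to one neighbor exceeds
    `threshold` while the discontinuity to the opposite neighbor is less
    than `ratio` times as large.
    """
    n = len(z)
    edges = [False] * n
    for i in range(1, n - 1):
        left = abs(z[i] - z[i - 1])
        right = abs(z[i] - z[i + 1])
        big, small = max(left, right), min(left, right)
        edges[i] = big > threshold and small < ratio * big
    return edges


step = jump_edges([0, 0, 0, 5, 5, 5], threshold=2)
ramp = jump_edges([0, 0, 3, 6, 9, 9, 9], threshold=2)
```

For the step profile both samples adjacent to the jump are flagged; for the ramp only the samples at its front and back are flagged, not the samples in its interior.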


Figure 3.3.6 Signs of curvatures calculated by convolving modified range map with 9 by 9
window operators. Gray is zero curvature (planar), black is negative curvature
(concave), and white is positive curvature (convex). (a) Gaussian Curvature.
(b) Mean Curvature.

Figure 3.3.7 Detected edges.

The next most obvious boundary between surfaces is formed by points where surface
normals differ significantly in orientation. These are known as surface normal disparity
edges. We use a threshold value of 25 degrees difference in orientation to identify these
pixels. Figure 3.3.7(b) is an example surface normal disparity edge image. The results are
very good because of good surface normal determination.
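
The disparity test reduces to thresholding the angle between two surface normals. A minimal sketch follows, with hypothetical function names; the 25 degree default is the threshold quoted above, and the normals need not be unit length.

```python
import math


def angle_between(n1, n2):
    """Angle in degrees between two 3-D surface normals."""
    dot = sum(p * q for p, q in zip(n1, n2))
    norm = math.sqrt(sum(p * p for p in n1)) * math.sqrt(sum(q * q for q in n2))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cosine))


def is_disparity_edge(n1, n2, threshold_deg=25.0):
    """True when the two normals differ in orientation by more than the threshold."""
    return angle_between(n1, n2) > threshold_deg
```

With normals written as (a, b, -1), a patch facing the sensor, (0, 0, -1), and a patch with unit slope in x, (1, 0, -1), differ by 45 degrees and would be separated by a disparity edge.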

An improvement in the performance of edge segmentation can sometimes be gained
by considering curvature information. Thresholds corresponding to some minimum allowable
radii of curvature are set, and if the magnitude of the mean or Gaussian curvature
exceeds these thresholds, the pixel is labeled as a curvature maxima edge. Whether the edge
is concave or convex may be determined by looking at the sign of the curvature. Figure
3.3.7(c) illustrates curvature edges for a minimum allowable radius of a tenth of a meter
for both Gaussian and mean curvature. These edges are poorer than the previous types
because curvature is generally noisy. However, if one looks back at the sign of the mean
curvature image in Figure 3.3.6, one can immediately tell which edges are convex and
concave.
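
One way to turn a minimum allowable radius of curvature into thresholds is sketched below. The mapping is an assumption made for illustration: a radius r is taken to correspond to a mean curvature magnitude of 1/r and a Gaussian curvature magnitude of 1/r^2, since Gaussian curvature is a product of two principal curvatures. The function name is hypothetical.

```python
def curvature_edge(H, K, min_radius=0.1):
    """Return (is_edge, sign of the mean curvature) for one pixel.

    Assumed thresholds: |H| > 1/r or |K| > 1/r^2, with r in meters.
    The sign is +1, 0 or -1 and can be used to tell convex from concave.
    """
    h_threshold = 1.0 / min_radius
    k_threshold = 1.0 / (min_radius ** 2)
    is_edge = abs(H) > h_threshold or abs(K) > k_threshold
    sign = (H > 0) - (H < 0)
    return is_edge, sign
```

For a radius of a tenth of a meter this gives thresholds of about 10 per meter for the mean curvature and about 100 per square meter for the Gaussian curvature.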

The remaining pixels are now given a label corresponding to the surface they
belong to via a connected component labeling algorithm. Figure 3.3.8 shows the resulting
labels with and without the use of curvature information. In these images, each
surface is given a unique gray value. Note that even though the curvature edges are not as
clean, they do help fill in gaps between other edges that would otherwise lead to incorrect labeling.
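
Connected component labeling of the non-edge pixels can be sketched with a breadth-first flood fill. This is a generic version assuming 4-connectivity (the connectivity actually used is not stated here); the function name and the boolean edge-map representation are hypothetical.

```python
from collections import deque


def label_surfaces(is_edge):
    """Label 4-connected regions of non-edge pixels.

    is_edge is a 2-D list of booleans. Returns (labels, count), where edge
    pixels keep label 0 and surfaces are numbered from 1.
    """
    rows, cols = len(is_edge), len(is_edge[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if is_edge[r][c] or labels[r][c]:
                continue
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not is_edge[ny][nx] and not labels[ny][nx]):
                        labels[ny][nx] = count
                        queue.append((ny, nx))
    return labels, count
```

A single missing edge pixel is enough to merge two surfaces into one label, which is why edges that fill gaps matter for this scheme.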

3.3.1.3. Results on Noisy Images

The M60A1 tank image we have been working with was downsampled by a factor of
four to obtain an image roughly equivalent in resolution to a LADAR image taken at two
kilometers (same x and y resolution, but finer z resolution as discussed in the beginning of
this section). The image was then degraded with noise as described in Section 4.1.3 using
two dropout rates and three different Gaussian variances. Variances of 1, 3, and 7 were
used with a dropout probability of 0.005, and a Gaussian variance of 3 was used with a
dropout probability of 0.05. The cleanest LADAR images found when determining the
noise characteristics of the actual data had a Gaussian variance of 7 for their gray values
and dropout rate of 0.005. Note that we have a higher resolution in the z direction (in our
gray values), and that a Gaussian variance of 7 in gray values for us corresponds to an
error of 0.29 meters, which is pretty high for the size of the targets and the kind of processing
we hope to do. For completeness, a variance of 3 for us corresponds to 0.125 meters,
and a 1 gray value variance corresponds to 0.04 meters.

Figure 3.3.9 contains the original degraded downsampled range images, while Figure
3.3.10 shows the results of processing them using 5 by 5 processing windows and no curvature
information (it is too noisy). Higher level processing (Section 3.4) should be able to
handle the low level results of Figure 3.3.10(a) and (b) with no problem, would fail
miserably for (c), and may or may not be able to make sense out of (d).

Figure 3.3.8 Surface label images. (a) Using Curvature. (b) Without Curvature.

Figure 3.3.9 Original degraded range images. (a) Var = 1, Prob. Dropout = 0.005.
(b) Var = 3, Prob. Dropout = 0.005. (c) Var = 7, Prob. Dropout = 0.005.
(d) Var = 3, Prob. Dropout = 0.05.

Figure 3.3.10 Labeled surfaces for degraded images using 5 by 5 processing windows and no
curvature information. (c) Var = 7, Prob. Dropout = 0.005. (d) Var = 3, Prob. Dropout = 0.05.

3.3.1.4. Conclusions and Future Work

Our  algorithms  and  techniques  work  well  for  reasonable  range  data.  Additional 
preprocessing  may  be  necessary  to  rid  actual  LADAR  of  noise,  and  we  might  not  be  able 
to  use  a  geometric  approach  if  later  LADAR  data  does  not  prove  to  have  the  resolution 
necessary  to  provide  adequate  surface  information.  If  this  is  found  to  be  the  case,  we 
would be forced to use the 2-D, silhouette-based approach we have been using in our processing
of FLIR data. The only advantage of using LADAR would be better silhouette
segmentation  immune  to  the  variance  found  in  FLIR  and  other  reflectance  imagery. 

As  far  as  surface  segmentation  goes,  a  region  growing  scheme  may  fare  better  than 
our  present  edge  detection/surface  labeling  scheme.  A  region  growing  approach  would  not 
be  susceptible  to  gaps  in  edges,  since  it  does  not  depend  on  them  explicitly.  With  this 
exploration  of  region  growing  we  will  also  try  using  normal  curvature  instead  of  Gaussian 
and  mean  curvature.  Since  normal  curvature  uses  surface  normal  information  directly,  it 
will  be  able  to  take  advantage  of  our  improved  surface  normal  routines  and  should  not  be 
as  noisy  as  Gaussian  and  mean  curvature.  Also,  since  normal  curvature  is  defined  between 
two points it gives information regarding edge direction, whereas Gaussian and mean curvatures
only provide magnitudes with no directional information.

In the future, our low level routines will be extended to calculate the surface attributes
and relations necessary for our high level processing, once we have determined what
these attributes and relations are.

3.3.2.  New  Low  Level  Processing  Scheme 

We initiated our research effort by applying our existing general 3-D vision software
to LADAR data. As described in [KaYo88, CroKa87], this scheme entailed the generation
of (x,y,z) locations (the range map), computation of surface normals and curvature, labeling
of edge pixels, and labeling of surface regions, followed by determination of surface
attributes and relations and invocation of domain-specific classification rules. Surface
extraction by first detecting edges and then performing connected component labeling on
the remainder of the image proved to be too sensitive to the high variance noise and large
number of dropouts present in actual LADAR data. This large amount of noise also made
Gaussian and mean curvature information, which is derived from the second order derivatives
of the range map, relatively useless.

Figure 3.3.11 illustrates our new approach to surface extraction. As before, the first
step is to convert the range image into a range map of (x,y,z) locations. This is followed
by two optional range map preprocessing steps for noise removal, as indicated by the
dashed bubbles in the figure. These will be presented in the next section. Surface normals
are then computed using the planar patch fitting, minimal error assignment algorithm
discussed in [KaYo88, KaCh87, ChKa88]. Finally, surface labels are assigned by a region
growing algorithm to be discussed in detail in Section 3.3.2.2. As before, surface extraction
is to be followed by determination of surface attributes and relations, and finally
object classification via the high level routines.

Figure 3.3.11 Extraction of surfaces from range images. Steps shown: Convert Range Image
to Range Map; Median Filter Range Map; Fit 2-D B-spline to Range Map; Compute Surface
Normals; Surface Labels via Region Growing.

3.3.2.1. Preprocessing

The  study  of  the  noise  characteristics  of  the  1986  A.P.  Hill  LADAR  data  reported  in 
[KaYo88]  indicates  that  the  FM  component  of  the  resolved  range  data  may  be  off  as  much 
as  9  meters  on  a  target  at  1400  meters,  and  that  the  AM  portion  is  corrupted  by  Gaussian 
noise  with  a  typical  standard  deviation  of  0.75  meters.  Various  dropout  rates  are  also 
present  in  the  imagery.  Until  a  more  rigorous  noise  model  is  found  for  the  sensor,  proven 
general  purpose  methods  of  removing  noise  must  be  relied  upon. 

One such method is the simple application of a 2-D median filter to the range map.
Because the output of the median filter at a given location could be any of the pixel values
within the processing window, there was some concern that the median filter may destroy
some of the fine surface information present in the original range map. A simple experiment,
presented in Section 3.3.2.3, was carried out in order to determine how median
filtering the range map affects the results of later processing steps. Based on that experiment
and processing results for synthetic and actual LADAR images, we found that a 3 by
3 median filter improved the ability of our programs to successfully extract surfaces. It is
therefore included as an optional preprocessing step in our overall scheme.
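
A 3 by 3 median filter is short enough to sketch directly. The version below is a generic illustration, assuming the range values are stored as a 2-D list and that border pixels, which lack a full window, are simply copied through; the function name is hypothetical.

```python
import statistics


def median_filter_3x3(z):
    """Replace each interior value with the median of its 3 by 3 neighborhood."""
    rows, cols = len(z), len(z[0])
    out = [row[:] for row in z]  # border pixels are left unchanged
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            window = [z[r + dy][c + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[r][c] = statistics.median(window)
    return out
```

An isolated dropout is replaced by a neighboring range value, while a straight step between two surfaces passes through unchanged, which is the property that makes the filter attractive for range maps.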

A second approach to handling noisy data involves fitting surfaces to it. We have
already shown how fitting planar patches can be useful for surface normal computation.
However, real surfaces in real data are seldom planar. We therefore propose the fitting of
higher order surfaces to the data. Yang and Kak [YaKa86a, YaKa86b] have shown how
2-D B-splines may be used to fit a bicubic surface patch. A bicubic patch is of third order,
guaranteeing continuity of the first-order and second-order derivatives, and so usually
yields the smoothest fit to the data points. 2-D B-splines are fit to successive 4 by 4 windows
of the range map. The fit does not necessarily pass through the 16 data values, but
merely uses them as control points. At a given window position the fit is computed for the
four points in the center of the window. Continuity up to the second order derivative is
guaranteed between the current patch and the one computed for the next window position.
See [FaPr79] for a discussion of the merits of splines over other interpolation schemes.
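
The following Python sketch shows one way a bicubic patch can be evaluated from a 4 by 4 window of range values used as control points. It uses the standard uniform cubic B-spline blending functions; the function names and the parameterization of the centre cell by (u, v) in [0, 1] are assumptions made for illustration and are not taken from [YaKa86a, YaKa86b].

```python
def bspline_basis(t):
    """Uniform cubic B-spline blending weights for a parameter t in [0, 1]."""
    return (
        (1 - t) ** 3 / 6.0,
        (3 * t**3 - 6 * t**2 + 4) / 6.0,
        (-3 * t**3 + 3 * t**2 + 3 * t + 1) / 6.0,
        t**3 / 6.0,
    )


def bicubic_patch(window, u, v):
    """Evaluate the bicubic B-spline patch defined by a 4x4 window of range values.

    (u, v) spans the cell bounded by the four centre samples of the window:
    u runs along rows and v along columns.
    """
    bu, bv = bspline_basis(u), bspline_basis(v)
    return sum(bu[i] * bv[j] * window[i][j] for i in range(4) for j in range(4))


# A planar window is reproduced exactly at the centre samples ...
plane = [[2 * i + 3 * j for j in range(4)] for i in range(4)]
on_plane = bicubic_patch(plane, 0.0, 0.0)

# ... but an isolated spike is smoothed rather than interpolated.
spike = [[0.0] * 4 for _ in range(4)]
spike[1][1] = 9.0
at_spike = bicubic_patch(spike, 0.0, 0.0)
```

The second case illustrates that the fitted surface does not pass through the control points, and that a single outlying value still pulls the whole patch toward it.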

While 2-D B-spline interpolation definitely yields a smooth range map, it does not
remove  the  effects  of  dropouts.  Single  noise  pixels  which  have  values  that  differ  greatly 
from  the  true  distance  to  the  surface  can  cause  large  distortions  in  the  fitted  surface.  It  is 
therefore  desirable  to  remove  dropouts  via  median  filtering  or  median-based  range  bin 
correction  (MBRBC)  [KaYo88]  before  fitting  B-splines.  The  fitting  of  2-D  B-splines  is 
included  as  an  optional  step  in  the  preprocessing  of  the  range  map. 

Besides using the median filter and 2-D B-splines, we have also tried processing
the AM component of the data alone and the "range bin corrected" AM/FM resolved range
data (3 by 3 MBRBC in [KaYo88]). The AM component of the data is obtained by taking
mod 1875 of the original AM/FM resolved range. The magnitude of the noise in this data
is considerably smaller than in the combined AM/FM. The MBRBC data is basically a
median filter of the FM component of the data, and so is an attempt to remove the larger
noise component from the combined AM/FM.

3.3.2.2.  Region  Growing  Approach 

We have adopted a region growing algorithm for labeling surfaces in our range map
after preprocessing and surface normal computation. There are three criteria which must
be satisfied by a neighboring pixel b at (x,y,z) location $\vec{r}_b$ in order for it to be included in
the same region as the current pixel a at $\vec{r}_a$. First, the neighbor’s distance from the current
pixel must be less than a specified threshold dst_thr. Second, the angle between the
neighbor’s unit surface normal $\hat{n}_b$ and the current pixel’s unit normal $\hat{n}_a$ cannot exceed
ang_thr. Finally, the normal curvature defined between the neighboring pixel and the
current one must be less than crv_thr. The expression we use for normal curvature is

$$ K_n = \frac{2 \sin\left( \arccos(\hat{n}_a \cdot \hat{n}_b) / 2 \right)}{\lVert \vec{r}_a - \vec{r}_b \rVert} $$

which  is  merely  the  sine  of  half  the  angle  between  the  surface  normals  divided  by  half  the 
distance  between  the  two  pixels.  This  expression  for  normal  curvature  has  been  found  to 
be  less  sensitive  to  noise  than  the  mean  and  Gaussian  curvatures  used  previously 
[ChKa88].  This  is  due  in  part  to  the  fact  that  the  explicit  computation  of  the  second  order 
derivatives  of  the  range  map  is  not  necessary  in  order  to  determine  the  normal  curvature. 
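
The three annexation criteria and the normal curvature expression can be sketched in Python as follows. The function names, the argument order, and the treatment of coincident points are assumptions made for this example; angles are taken in degrees and distances in meters, matching the units used in our tables.

```python
import math


def normal_curvature(r_a, r_b, n_a, n_b):
    """Sine of half the angle between two unit normals over half the point distance."""
    cos_angle = max(-1.0, min(1.0, sum(x * y for x, y in zip(n_a, n_b))))
    angle = math.acos(cos_angle)
    return 2.0 * math.sin(angle / 2.0) / math.dist(r_a, r_b)


def can_annex(r_a, r_b, n_a, n_b, dst_thr, ang_thr, crv_thr):
    """Apply the distance, normal-angle, and normal-curvature tests in turn."""
    distance = math.dist(r_a, r_b)
    # Coincident points are rejected to avoid dividing by zero (an arbitrary choice).
    if distance == 0.0 or distance >= dst_thr:
        return False
    cos_angle = max(-1.0, min(1.0, sum(x * y for x, y in zip(n_a, n_b))))
    if math.degrees(math.acos(cos_angle)) > ang_thr:
        return False
    return normal_curvature(r_a, r_b, n_a, n_b) < crv_thr


# Two points on a cylinder of radius 2 m, 0.2 rad apart, with radial normals.
theta = 0.2
r_a, n_a = (2.0, 0.0, 0.0), (1.0, 0.0, 0.0)
r_b = (2.0 * math.cos(theta), 2.0 * math.sin(theta), 0.0)
n_b = (math.cos(theta), math.sin(theta), 0.0)
k = normal_curvature(r_a, r_b, n_a, n_b)
```

For points on a circular arc the expression returns the reciprocal of the radius (here 0.5 per meter), which is why it behaves as a curvature estimate without requiring second derivatives of the range map.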

If  all  three  of  the  above  criteria  are  met,  the  neighboring  pixel  is  considered  to  be  part 
of  the  region  to  which  the  current  pixel  belongs.  The  8-neighbors  of  the  current  pixel  are 
considered  for  annexation  to  the  current  region,  resulting  in  8-connected  surfaces.  Our 
implementation  of  the  region  growing  algorithm  is  a  two-pass  labeling  scheme  which  is 
more efficient than the usual recursive implementation. The algorithm also discards surfaces
made up of fewer pixels than the processing window used in surface normal computation.
For example, if a 5 by 5 processing window was used to compute the surface normals,
then surfaces containing fewer than 25 pixels would be considered too small and discarded.
The following section presents results obtained by this new low-level scheme for surface
extraction. Of course, the criterion used to evaluate the processing results is the examination
of how well the extracted surfaces compare with those present in our models.
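
A two-pass, 8-connected labeling scheme of the kind described above might be organized as in the following Python sketch. It is a generic union-find formulation, not a transcription of our implementation: the function label_regions, the callback similar (standing in for the three annexation criteria), and the convention that label 0 marks discarded pixels are all assumptions made for illustration.

```python
def label_regions(rows, cols, similar, min_pixels):
    """Label 8-connected regions in two passes; regions under min_pixels get label 0.

    similar(p, q) takes two (row, col) tuples and returns True when the
    pixels may share a region.
    """
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[max(root_a, root_b)] = min(root_a, root_b)

    labels = [[0] * cols for _ in range(rows)]
    next_label = 1
    # Pass 1: assign provisional labels and record equivalences.
    for r in range(rows):
        for c in range(cols):
            # Only the already-visited 8-neighbors: W, NW, N, NE.
            for dr, dc in ((0, -1), (-1, -1), (-1, 0), (-1, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and similar((r, c), (rr, cc)):
                    if labels[r][c] == 0:
                        labels[r][c] = labels[rr][cc]
                    else:
                        union(labels[r][c], labels[rr][cc])
            if labels[r][c] == 0:
                labels[r][c] = next_label
                parent[next_label] = next_label
                next_label += 1

    # Pass 2: replace provisional labels by their representatives and count sizes.
    sizes = {}
    for r in range(rows):
        for c in range(cols):
            root = find(labels[r][c])
            labels[r][c] = root
            sizes[root] = sizes.get(root, 0) + 1

    # Discard surfaces that are too small.
    for r in range(rows):
        for c in range(cols):
            if sizes[labels[r][c]] < min_pixels:
                labels[r][c] = 0
    return labels
```

With a 5 by 5 surface normal window, min_pixels would be set to 25, following the rule given above.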

3.3.2.3. Results on Synthetic Data

As mentioned in Section 3.3.2.1, an experiment was run on "clean" synthetic data in
order to determine the effects of median filtering range maps using different sized processing
windows. Figure 3.3.12 shows the original PADL synthetic LADAR image and difference
images for 3 by 3, 5 by 5, and 7 by 7 median filter processing windows. The darkness
of a difference image pixel is directly related to the difference between the gray level of
that pixel in the original image and in the median filtered image, thereby illustrating which
pixels were affected and by how much. Figure 3.3.13 shows the surfaces extracted for
each of the four images. Each surface is assigned a random color so that it is easy to differentiate
from its neighbors. The white blotches are made up of surfaces which were discarded
because they were too small. The processing parameters and results are summarized
in Table 3.3.1. Note that the larger 5 by 5 and 7 by 7 median filters destroyed some of
the edge information by smoothing out discontinuities, causing adjacent surfaces to
become merged together. The 3 by 3 median filter produced results comparable to those
obtained with no preprocessing, and so appears acceptable for noise removal.

Figures 3.3.14 through 3.3.17 show the processing results for synthetic imagery with
various amounts of noise. In each of these four figures (a) is the original range image, (b)
is the result obtained using the old edge detection/component labeling method of surface
extraction and no preprocessing, and (c)-(f) are the results obtained with the new region
growing scheme and various amounts of preprocessing. No preprocessing was done for
(c), B-splines alone were used for (d), median filter only for (e), and (f) illustrates median
filtering followed by fitting of B-splines. The results and processing parameters are summarized
in Table 3.3.2. Compare (b) and (c), the results for the old and new low level
schemes with no noise removal. The only case where they are significantly different is
Figure 3.3.16, the additive noise with the highest Gaussian standard deviation. Here the
region growing approach proved to be superior to the edge detection/component labeling
scheme. Also note that preprocessing always helped extract surfaces ((d) and (e) are
always better than (b) and (c)), but it is also possible to over-preprocess, as indicated by
the merged surfaces when both median filtering and B-spline interpolation are applied to
images with low noise. It is interesting to note that for Figure 3.3.17, the one with the
highest dropout rate, the application of both median filtering and B-splines produced the
best segmentation of any of those attempted on the noisy synthetic data.

3.3.2.4.  Results  on  1987  A.P.  Hill  LADAR  Data 

We also tried our surface extraction algorithms on actual LADAR data. Combined
AM/FM resolved range images, 3 by 3 MBRBC AM/FM images, and mod 1875 AM component
images were processed. The results for subimages containing single targets are
presented in Figures 3.3.18 through 3.3.26. Part (a) of each of these figures is the original
range image, part (b) is the result with no preprocessing, (c) used B-splines alone, (d) only


(c) diff.5  (d) diff.7


Figure 3.3.12 Effects of median filtering range images: (a) original PADL range image (b)
difference between original and 3 by 3 median (c) 5 by 5 median (d) 7 by 7
median.


(a)  m60.7x7.nomed 


(c)  m60.7x7.med5  (d)  m60.7x7.med7 


Figure  3.3.13  Effects  of  median  filtering  range  images:  surfaces  extracted  for  (a)  original 
PADL  range  image  (b)  3  by  3  median  (c)  5  by  5  median  (d)  7  by  7  median. 


Table 3.3.1 Summary of processing results for median filtered PADL images.


Processing Results for Median Filtered Synthetic Range Images

                 Preprocessing               Thresholds
Filename         Median       Fit 2-D       distance     angle        curvature    Number of
                 Filter?      B-splines?    (meters)     (degrees)    (1/meters)   surfaces

m60.7x7.nomed    [illegible]  no            [illegible]  [illegible]  5            17
m60.7x7.med3     [illegible]  no            [illegible]  [illegible]  5            15
m60.7x7.med5     [illegible]  no            0.5          20           5            13
m60.7x7.med7     7x7          no            0.5          20           5            9

Table 3.3.2 Summary of processing results for noisy PADL images.


Processing Results for Noisy Synthetic Range Images

                 Preprocessing               Thresholds
Filename         Median       Fit 2-D       distance     angle        curvature    Number of
                 Filter?      B-splines?    (meters)     (degrees)    (1/meters)   surfaces

sm_m60.n1.1      no           no            [illegible]  20           5            9
sm_m60.n1.1b     no           yes           [illegible]  20           5            10
sm_m60.n1.1m     yes          no            0.5          20           5            35
sm_m60.n1.1mb    yes          yes           0.5          20           5            71
sm_m60.n2.1      no           no            0.5          20           5            8
sm_m60.n2.1b     no           yes           0.5          20           5            10

Figure 3.3.14 Results of processing noisy PADL image with Gaussian standard deviation = 1
and dropout probability = 0.005 using a 5 by 5 processing window and various
preprocessing.

Figure 3.3.15 Results of processing noisy PADL image with Gaussian standard deviation = 3
and dropout probability = 0.005 using a 5 by 5 processing window and various
preprocessing.

Figure 3.3.16 Results of processing noisy PADL image with Gaussian standard deviation = 7
and dropout probability = 0.005 using a 5 by 5 processing window and various
preprocessing.

Figure 3.3.17 Results of processing noisy PADL image with Gaussian standard deviation = 3
and dropout probability = 0.05 using a 5 by 5 processing window and various
preprocessing.

applied a median filter, and (e) is median filtering followed by fitting B-splines. Files
c10_apc and c10_tank were extracted from image c10ml, while c17_apc was extracted
from c17ml. Both of the larger images were processed in their entirety, and the results are
presented in Figures 3.3.27 through 3.3.36. Part (a) of these figures deals with the combined
AM/FM resolved range image, while part (b) shows results obtained by processing
the AM component alone. The c10 targets are at a range of 1020m, while c17 targets are
1700m away. Both sets of images were processed using 7 by 7 processing windows for
surface normal computation. Table 3.3.3 summarizes the results for the resolved range
LADAR images, Table 3.3.4 shows MBRBC corrected AM/FM image results, and Table
3.3.5 contains AM component image statistics.

The resolved range images tended to be too noisy and produced poor results. The
MBRBC images did better, producing good results for broadside views of the APCs after
additional preprocessing, but poor results for the more complex tank at a 45 degree aspect
angle. However, the mod 1875 images containing only the AM component produced good
results after a little preprocessing. For example, Figure 3.3.21 shows how the turret and
front and side surfaces of a tank were extracted. Median filtering by itself and followed by
fitting B-splines proved most effective, while B-splines alone did a poor job because of the
problem with dropouts. While mod 1875 images produced good results, the fact that there
is periodic wrap around from black to white can cause problems. For instance, the front of
c10_apc.AM is barely touching a wrap around point and so has many white spots of "noise"
due to the normal variance of the data. While the median filter cleared most of this up,
there is still a hole in the final result caused by what remained.


3.3.2.5.  Surface  Feature  Set  &  Algorithm  Evaluation  Measures 

In  this  section  we  define  a  preliminary  surface  feature  set  in  order  to  bridge  the  gap 
between  the  low  and  high  level  processing  in  our  geometric  approach,  as  well  as  to  provide 
a  rigorous  means  of  evaluating  our  surface  extraction  algorithms.  Our  set  includes: 


(1) pixel count, $|S|$, the number of pixels on surface S.

(2) surface area, $A_s$, in square meters.

(3) bounding box, $B_s = [x_{\min}, x_{\max}, y_{\min}, y_{\max}, z_{\min}, z_{\max}]_s$, defining the 3-D
extent of the surface via its minimum and maximum x, y, and z coordinate values.

(4) centroid vector $\vec{r}_s$ for the surface, which may be considered to be its "location." It is
computed by averaging the (x,y,z) locations $\vec{r}_p$ over the pixels p comprising the surface S.

(5) unit normal vector to the surface, $\hat{n}_s$, computed by averaging the normals of all pixels
belonging to the surface. It may be considered as the surface "orientation."

Figure 3.3.18 Results of processing resolved range image of an APC at 1020m.

Figure 3.3.19 Results of processing MBRBC range image of an APC at 1020m.

Figure 3.3.20 Results of processing AM component range image of an APC at 1020m.

Figure 3.3.21 Results of processing resolved range image of a tank at 1020m.

Figure 3.3.22 Results of processing MBRBC range image of a tank at 1020m.

Figure 3.3.24 Results of processing resolved range image of an APC at 1700m.

Figure 3.3.25 Results of processing MBRBC range image of an APC at 1700m.

Figure 3.3.26 Results of processing AM component range image of an APC at 1700m.


Table 3.3.3 Summary of processing results for AM/FM resolved range LADAR images.


Processing Results for Actual LADAR Range Images

                     Preprocessing               Thresholds
Filename             Median       Fit 2-D       distance   angle      curvature    Number of
                     Filter?      B-splines?    (meters)   (degrees)  (1/meters)   surfaces

c10.AMFM.1           no           no            0.3        60         [illegible]  2
c10.AMFM.1b          no           yes           0.3        60         [illegible]  8
c10.AMFM.1m          yes          no            0.3        60         20           82
c10.AMFM.1mb         yes          yes           0.3        60         20           98
c17.AMFM.1           no           no            0.3        60         20           4
c17.AMFM.1b          no           yes           0.3        60         20           1
c17.AMFM.1m          yes          no            0.3        60         20           39
c17.AMFM.1mb         yes          yes           0.3        60         20           43
c10_apc.AMFM.1       no           no            0.3        60         20           10
c10_apc.AMFM.1b      no           yes           0.3        60         20           5
c10_apc.AMFM.1m      [illegible]  no            0.3        60         20           8
c10_apc.AMFM.1mb     yes          [illegible]   0.3        60         20           6
c10_tank.AMFM.1      no           no            0.3        60         20           1
c10_tank.AMFM.1b     no           yes           0.3        60         20           0
c10_tank.AMFM.1m     yes          no            0.3        60         20           31
c10_tank.AMFM.1mb    yes          yes           0.3        60         20           22
c17_apc.AMFM.1       no           no            0.3        60         20           1
c17_apc.AMFM.1b      no           yes           0.3        60         20           1
c17_apc.AMFM.1m      yes          no            0.3        60         20           5
c17_apc.AMFM.1mb     yes          yes           0.3        60         20           4

Table 3.3.4 Summary of processing results for 3x3 MBRBC of combined AM/FM LADAR
images.


Processing Results for MBRBC LADAR Range Images

                     Preprocessing               Thresholds
Filename             Median       Fit 2-D       distance   angle      curvature    Number of
                     Filter?      B-splines?    (meters)   (degrees)  (1/meters)   surfaces

c10_apc.RBC.1        no           no            0.3        60         20           [illegible]
c10_apc.RBC.1b       no           yes           0.3        60         20           [illegible]
c10_apc.RBC.1m       yes          no            0.3        60         20           [illegible]
c10_apc.RBC.1mb      yes          yes           0.3        60         20           [illegible]
c10_tank.RBC.1       no           no            0.3        60         20           [illegible]
c10_tank.RBC.1b      no           yes           0.3        60         20           [illegible]
c10_tank.RBC.1m      yes          no            0.3        60         20           [illegible]
c10_tank.RBC.1mb     yes          yes           0.3        60         20           [illegible]
c17_apc.RBC.1        no           no            0.3        60         20           [illegible]
c17_apc.RBC.1b       no           yes           0.3        60         20           [illegible]
c17_apc.RBC.1m       yes          no            0.3        60         20           [illegible]
c17_apc.RBC.1mb      yes          yes           0.3        60         20           [illegible]



Table 3.3.5 Summary of processing results for AM component of LADAR range images.


Processing Results for Mod 1875 LADAR Range Images

                     Preprocessing               Thresholds
Filename             Median       Fit 2-D       distance   angle      curvature    Number of
                     Filter?      B-splines?    (meters)   (degrees)  (1/meters)   surfaces

c10.AM.1             no           no            0.3        60         [illegible]  57
c10.AM.1b            no           yes           0.3        60         [illegible]  110
c10.AM.1m            yes          no            0.3        60         20           241
c10.AM.1mb           yes          yes           0.3        60         20           249
c17.AM.1             no           no            0.3        60         20           22
c17.AM.1b            no           yes           0.3        60         20           30
c17.AM.1m            [illegible]  no            0.3        60         20           145
c17.AM.1mb           yes          [illegible]   0.3        60         20           164
c10_apc.AM.1         no           no            0.3        60         20           10
c10_apc.AM.1b        no           yes           0.3        60         20           15
c10_apc.AM.1m        yes          no            0.3        60         20           27
c10_apc.AM.1mb       yes          yes           0.3        60         20           25
c10_tank.AM.1        no           no            0.3        60         20           25
c10_tank.AM.1b       no           yes           0.3        60         20           23
c10_tank.AM.1m       yes          no            0.3        60         20           44
c10_tank.AM.1mb      yes          yes           0.3        60         20           32
c17_apc.AM.1         no           no            0.3        60         20           10
c17_apc.AM.1b        no           yes           0.3        60         20           7
c17_apc.AM.1m        yes          no            0.3        60         20           9
c17_apc.AM.1mb       yes          yes           0.3        60         20           13

(b) c10.AM

Figure 3.3.27 1020m LADAR images (a) AM/FM resolved range (b) mod 1875 AM component.


(a) c10.AMFM.1


(b) c10.AM.1


Figure 3.3.28 Processing results for 1020m LADAR images, no preprocessing (a) AM/FM
resolved range (b) mod 1875 AM component.

(b) c10.AM.1b

Figure 3.3.29 Processing results for 1020m LADAR images, B-splines only (a) AM/FM
resolved range (b) mod 1875 AM component.

(a) c10.AMFM.1m


(b) c10.AM.1m

Figure 3.3.30 Processing results for 1020m LADAR images, median filter only (a) AM/FM
resolved range (b) mod 1875 AM component.

(a) c17.AMFM


(b) c17.AM

Figure 3.3.32 1700m LADAR images (a) AM/FM resolved range (b) mod 1875 AM component.

(b) c17.AM.1


Figure 3.3.33 Processing results for 1700m LADAR images, no preprocessing (a) AM/FM
resolved range (b) mod 1875 AM component.

Figure 3.3.34 Processing results for 1700m LADAR images, B-splines only (a) AM/FM
resolved range (b) mod 1875 AM component.


(a) c17.AMFM.1m


(b) c17.AM.1m


Figure 3.3.35 Processing results for 1700m LADAR images, median filter only (a) AM/FM
resolved range (b) mod 1875 AM component.


(b) c17.AM.1mb

Figure 3.3.36 Processing results for 1700m LADAR images, median filter & B-splines (a)
AM/FM resolved range (b) mod 1875 AM component.

(6) local planar fit mean square error, $E_l$. When surface normals are computed for each
pixel by fitting a planar patch to the data points within the processing window, the
mean square fit error of the patch is also computed and associated with the pixel. The
average of this fit error taken over the pixels comprising a surface is called its local
planar fit error. It is a measure of how well the surface was represented by its composite
patches.

(7) global planar fit mean square error, $E_g$, measured between the pixels of a surface S
and the plane specified by its location (centroid $\vec{r}_s$) and orientation (unit normal $\hat{n}_s$).
This feature is indicative of the overall planarity of the surface.

(8) noise of location, $L_n$, is computed by averaging, over pixels $p \in S$ whose 4-neighbors
are all also in surface S, the distance in meters between each pixel location $\vec{r}_p$ and the
centroid of its 4-neighbors.

(9) noise of orientation, $O_n$, is the average taken over pixels $p \in S$ of the angle in degrees
between the individual pixel’s surface normal $\hat{n}_p$ and the average normal $\hat{n}_s$ of surface S.

Features  1-5  yield  basic  geometric  properties  of  the  surfaces,  while  6-9  indicate 
image  quality  by  measuring  noise.  Features  6  and  7  also  provide  a  limited  idea  of  surface 
shape  by  measuring  local  and  global  planarity.  Surface  relations  (e.g.  region  adjacency) 
and  additional  features  (e.g.  perimeter)  will  be  implemented  in  the  future. 
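
Several of the features above follow directly from the per-pixel locations and normals. The Python sketch below computes the pixel count, bounding box, centroid, average unit normal, global planar fit error, and noise of orientation for one surface. The function name, the dictionary keys, and the reading of "fit error" as squared perpendicular distance to the plane are assumptions made for illustration; surface area, local fit error, and noise of location are omitted because they need image-grid information not modeled here.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def surface_features(points, normals):
    """Compute a subset of the surface feature set.

    points  -- list of (x, y, z) pixel locations on the surface
    normals -- list of unit surface normals, one per pixel
    """
    count = len(points)
    centroid = tuple(sum(p[k] for p in points) / count for k in range(3))

    # Average the pixel normals, then renormalize to unit length.
    mean_normal = [sum(n[k] for n in normals) / count for k in range(3)]
    length = math.sqrt(_dot(mean_normal, mean_normal))
    unit_normal = tuple(x / length for x in mean_normal)

    # (min, max) pairs for x, y, and z.
    bounding_box = tuple(
        (min(p[k] for p in points), max(p[k] for p in points)) for k in range(3)
    )

    # Assumed definition: mean squared perpendicular distance to the plane
    # through the centroid with the average normal.
    offsets = [
        _dot(unit_normal, [p[k] - centroid[k] for k in range(3)]) for p in points
    ]
    global_fit_error = sum(d * d for d in offsets) / count

    # Average angle, in degrees, between each pixel normal and the average normal.
    angles = [
        math.degrees(math.acos(max(-1.0, min(1.0, _dot(n, unit_normal)))))
        for n in normals
    ]
    orientation_noise = sum(angles) / count

    return {
        "pixel_count": count,
        "bounding_box": bounding_box,
        "centroid": centroid,
        "unit_normal": unit_normal,
        "global_fit_error": global_fit_error,
        "orientation_noise": orientation_noise,
    }
```

For a perfectly planar, noise-free surface both the global fit error and the noise of orientation evaluate to zero, which is the behavior expected of the "image quality" features.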

We now define several measures which will prove useful in evaluating our processing
algorithms. In order to determine how successfully we have extracted a surface from an
image, we must measure how it differs from the ground truth. This ground truth information
is obtained from the target model after it has been transformed to coincide with the
position and orientation of the target in the image. Each extracted target surface is compared
with its corresponding model surface, checking both its position and orientation.
We may therefore define locational disparity

$$ L_d = \frac{\lVert \vec{r}_s - \vec{r}_m \rVert}{z} $$

as the distance between the extracted surface location, specified by its centroid $\vec{r}_s$, and the
model surface location, specified by its centroid $\vec{r}_m$, normalized by the range of the surface
z. We take range z to be the z-component of $\vec{r}_m$. Orientation disparity may also be
defined as

$$ O_d = 1 - \hat{n}_s \cdot \hat{n}_m $$

where $\hat{n}_s$ is the extracted surface’s unit normal, specifying its orientation, $\hat{n}_m$ specifies the
model surface’s orientation, and $O_d \in [0,1]$ indicates the disparity between them. We also
define model fit error

$$ E_f = \frac{1}{|S|} \sum_{p \in S} \left[ \hat{n}_m \cdot (\vec{r}_p - \vec{r}_m) \right]^2 $$

as the mean square fit error between the (x,y,z) locations $\vec{r}_p$ of the pixels p in the extracted
surface S and the plane specified by the corresponding model surface’s location $\vec{r}_m$ and
orientation $\hat{n}_m$. $E_f$ not only determines how well our extracted surface corresponds to the
model surface, but also may be used as a measure of the noise present in the image.
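
The three evaluation measures defined above can be sketched in Python as follows. The function names are hypothetical, and the model fit error is computed here as the mean squared perpendicular distance from each pixel to the model plane, which is one reading of the verbal definition rather than a confirmed formula.

```python
import math


def locational_disparity(r_s, r_m):
    """Distance between extracted and model centroids, normalized by range z."""
    return math.dist(r_s, r_m) / r_m[2]


def orientation_disparity(n_s, n_m):
    """One minus the dot product of the extracted and model unit normals."""
    return 1.0 - sum(a * b for a, b in zip(n_s, n_m))


def model_fit_error(points, r_m, n_m):
    """Mean squared distance of the surface pixels from the model plane.

    Assumes the error is measured perpendicular to the plane through r_m
    with unit normal n_m.
    """
    residuals = [
        sum(n * (p_k - m_k) for n, p_k, m_k in zip(n_m, p, r_m)) for p in points
    ]
    return sum(d * d for d in residuals) / len(points)


# An extracted surface 1 m off in x and 1 m too far in z, at a range of 100 m.
l_d = locational_disparity((1.0, 0.0, 100.0), (0.0, 0.0, 100.0))
o_d = orientation_disparity((0.0, 0.0, 1.0), (0.0, 0.0, 1.0))
e_f = model_fit_error(
    [(0.0, 0.0, 101.0), (1.0, 1.0, 101.0)], (0.0, 0.0, 100.0), (0.0, 0.0, 1.0)
)
```

In this toy case the orientation agrees exactly, so only the locational disparity and the fit error are nonzero.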

3.3.2.6.  Algorithm  Evaluation  Experiment 

In order to evaluate the performance of our algorithms, we use as test images the set
of range images corresponding to the shaded images in Figure 3.3.37. These high resolution
range images were generated using the new ETBM built upon the TWIN Solid Modeling
Package, and contained no additive noise. The images are shown in increasing order
of complexity. Figure 3.3.37(a) is an M113 APC placed on a flat ground plane and viewed
from broadside. Figure 3.3.37(b) is an M113 on a flat ground plane viewed from head on.
Figure 3.3.37(c) is the more complex M60A1 tank placed on a gently rolling ground plane
and viewed obliquely. Figure 3.3.37(d) contains a handful of relatively complex targets at
random aspects placed on rugged terrain.

At the same time TWIN generates a range image, it also generates two ground truth
data files. The first of these two files contains a ground truth surface label image, where the
value of each pixel in the image is the label of the model surface to which it belongs. This
information is illustrated in Figures 3.3.38 through 3.3.41. The second file contains the
ground truth values of geometric features $|M|$, $A_m$, $B_m$, $\vec{r}_m$, and $\hat{n}_m$ for each model surface
M.

Our surface extraction routines were run on this test set. Tables 3.3.6a and 3.3.6b
summarize the surface features computed for the M113 APC of Figure 3.3.37(b), and serve
as an example of feature computation. Note that only surfaces comprising the target are
listed (8 surfaces were found in the entire image), and that the surface labels listed in the
table are extracted surface labels ($S_i$'s), and do not correspond to the model surface
numbers ($M_i$'s) in Figure 3.3.39. Once surface features are computed, we are ready to
evaluate the processing results by first computing $L_d$, $O_d$, and $E_f$ for each target surface
extracted, and then determining which model surfaces went undetected. In order to compute
the three performance measures for extracted surface S, we must first match it to the
appropriate model surface M. This is accomplished by using the x and y components of $\vec{r}_s$
to index into the model surface label image. This approach is valid as long as the projections
of our object surfaces onto the image plane have convex borders, and no projected
surface completely surrounds another projected surface. This is true for our test data set.
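
The matching step described above, together with the flagging of model surfaces that no extracted surface maps to, might look like the following Python sketch. The function names, the dictionary-based data layout, and the to_pixel callback that converts centroid x and y values into label image indices are all hypothetical; the toy label image is invented for the example.

```python
def match_surfaces(extracted_centroids, model_labels, to_pixel):
    """Map each extracted surface label to the model surface under its centroid.

    extracted_centroids -- dict: extracted label -> (x, y, z) centroid
    model_labels        -- 2-D list holding the ground truth surface label image
    to_pixel            -- callable (x, y) -> (row, col); hypothetical helper that
                           depends on how the sensor geometry is represented
    """
    matches = {}
    for label, centroid in extracted_centroids.items():
        row, col = to_pixel(centroid[0], centroid[1])
        matches[label] = model_labels[row][col]
    return matches


def undetected_surfaces(model_surface_ids, matches):
    """Model surfaces that no extracted surface was matched to."""
    return sorted(set(model_surface_ids) - set(matches.values()))


# Toy label image with two model surfaces, 15 and 22 (invented data).
model_labels = [[15, 15, 22], [15, 15, 22]]
extracted = {"S3": (0.6, 0.4, 100.0)}
matches = match_surfaces(
    extracted, model_labels, lambda x, y: (int(round(y)), int(round(x)))
)
missing = undetected_surfaces([15, 22], matches)
```

When several extracted surfaces map to the same model label, the dictionary returned by match_surfaces makes that visible, since the same value then appears under more than one key.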


Figure  3.3.37  Test  images  for  surface  extraction  algorithm  evaluation  experiment. 


m113-90.edges


Figure 3.3.38 Model surface labels for M113 APC broadside view test image.


m113.edges


Figure 3.3.39 Model surface labels for M113 APC head on view test image.


m60a1.edges


Figure 3.3.40 Model surface labels for M60A1 tank test image.


(a) complex.edges


(b) complex.subl


Figure 3.3.41 Model surface labels for complex, multiple target test image.


Table 3.3.6a Summary of "Geometric" Surface Features for Figure 3.3.37(b).


Table  3.3.6b  Summary  of  "Image  Quality"  Surface  Features  for  Figure  3.3.37(b). 


Extracted  Target  Surface  Features  -  Part  2 

Once all extracted surfaces have been matched, the remaining unmatched model surfaces
are flagged as being undetected. Tables 3.3.7a-d contain the evaluation measures computed
for our test set and a list of undetected surfaces. Overall, the results are very good.
All processing errors were caused by one of the following cases:

i) surfaces whose height or width or both were on the order of the processing window
dimensions. This condition results in higher $E_f$ due to merged surfaces or moved pixels
in the modified range map. The modified range map becomes a factor because the
number of border pixels (where most of the moving takes place in clean imagery) is
close to the total number of pixels on the surface for small surfaces. Although $E_f$
increases for problems caused by small surfaces, $O_d$ and $L_d$ remain small and good
estimates of $\hat{n}_m$ and $\vec{r}_m$ are obtained.

ii) small surfaces generated where objects meet ground plane. This problem occurred
infrequently and depended on the slope of the ground patch, the processing window
size, and ang_thr. It is easily recognized since multiple extracted surfaces are
matched to the same model surface.

iii) overdetailed object models. In several cases, the target models indicate separate surfaces
where no true edge in the rendered image exists, either because of a lack of
sufficient jump discontinuity or surface normal disparity. For instance, the distinction
between surfaces 15 and 22 in the broadside view of the M113 is overdetailed since
no sensible edge exists between them. High $L_d$ and low $O_d$ and $E_f$ help identify an
object model problem.

3.3.2.7.  Conclusions  and  Future  Work 

In  conclusion,  we  see  that  region  growing  is  an  appropriate  approach  for  surface 
extraction  for  actual  LADAR  data.  As  far  as  preprocessing  is  concerned,  it  is  apparent  that 
median  filtering  is  a  mandatory  step.  We  now  have  in  place  the  tools  needed  to  evaluate 
whether  or  not  the  fitting  of  B-splines  after  median  filtering  improves  the  quality  of  our 
results,  and  whether  the  modified  range  map  helps.  We  have  seen  that  median-based  range 
bin  correction  provided  only  marginal  improvement.  An  examination  of  the  noise  in  the 
1987 A.P. Hill data will allow us to improve our sensor noise model and aid in the selection
of better preprocessing methods.

While  our  2-pass  implementation  of  region  growing  is  more  efficient,  it  may  be  more 
beneficial  to  use  a  standard  recursive  region  growing  procedure  that  can  keep  track  of  the 
aggregate  surface  normal  for  the  surface  it  is  currently  working  on.  A  neighboring  pixel 
would be annexed only if it satisfied an additional fourth constraint specifying that its surface
normal point in the same relative direction as the aggregate surface normal. This
would  allow  the  routine  to  eventually  stop  growing  a  surface  past  an  edge  blurred  by 
preprocessing  or  obscured  by  noise. 
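
The proposed fourth constraint can be illustrated with the Python sketch below. It grows a single region from a seed while tracking the sum of the normals annexed so far, and refuses any pixel whose normal deviates from that aggregate direction by more than a threshold. An explicit queue is used here in place of recursion; the function name, the threshold name agg_thr, and the pairwise_ok callback (standing in for the three existing criteria) are assumptions made for this example.

```python
import math
from collections import deque


def grow_region(seed, normals, pairwise_ok, agg_thr):
    """Grow one 8-connected region from seed with an aggregate-normal check.

    normals     -- dict: (row, col) -> unit surface normal
    pairwise_ok -- callable(p, q) -> bool, the existing pixel-to-pixel criteria
    agg_thr     -- maximum angle in degrees between a candidate pixel's normal
                   and the aggregate normal of the region (hypothetical parameter)
    """
    region = {seed}
    aggregate = list(normals[seed])  # running sum of annexed normals
    cos_thr = math.cos(math.radians(agg_thr))
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nb = (r + dr, c + dc)
                if nb in region or nb not in normals:
                    continue
                if not pairwise_ok((r, c), nb):
                    continue
                # Fourth constraint: compare against the aggregate direction.
                length = math.sqrt(sum(x * x for x in aggregate))
                cos_angle = sum(a * n for a, n in zip(aggregate, normals[nb])) / length
                if cos_angle < cos_thr:
                    continue
                region.add(nb)
                queue.append(nb)
                aggregate = [a + n for a, n in zip(aggregate, normals[nb])]
    return region


# A strip whose normals rotate by 10 degrees per pixel, imitating a blurred edge.
strip = {
    (0, c): (math.sin(math.radians(10 * c)), 0.0, math.cos(math.radians(10 * c)))
    for c in range(8)
}
grown = grow_region((0, 0), strip, lambda p, q: True, 22.0)
```

In this example every adjacent pair of pixels passes the pairwise test, yet growth stops after four pixels because the fifth normal has drifted too far from the aggregate direction.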


Table 3.3.7a Summary of Evaluation Measures for Figure 3.3.37(a).

Evaluation Measures for M113 APC Broadside View

S      M      O_d    L_d            E_f
S3     M15    0      0.000200208    0

Undetected Model Surfaces: 22


Table 3.3.7b Summary of Evaluation Measures for Figure 3.3.37(b).


Evaluation Measures for M113 APC Head-on View

[Table body largely illegible. Recoverable entries: rows S3, S5, S6 (matched to M4;
values 0, [illegible], 0), and S7 (matched to M3; values 0, [illegible]).]

Undetected Model Surfaces: [illegible]

Table 3.3.7c Summary of Evaluation Measures for Figure 3.3.37(c).


Evaluation Measures for M60A1 Tank Oblique View

[Table body largely illegible. Recoverable entries: 13.1433, S103.]

Undetected Model Surfaces:

2  68  9  10  11  22


23  38  44  45  46


Table 3.3.7d Summary of Evaluation Measures for Figure 3.3.37(d).


Evaluation Measures for Multiple Target Image

[Table body illegible.]

Table 3.3.7d continued.


Evaluation Measures for Multiple Target Image (cont.)

[Table body illegible.]

Figure 3.3.42 Output of segmenters for target apl .32403.
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Our  evaluation  experiment  has  validated  the  use  of  our  surface  extraction  approach. 
Except  for  a  few  incidences  which  were  explainable  by  three  special  cases,  very  low 
values  for  our  evaluation  measures  were  obtained  on  clean,  truthed  data.  This  indicates 
that  our  algorithms  are  able  to  accurately  extract  surfaces  present  in  LADAR  imagery  and 
do  not  introduce  errors  which  may  corrupt  later  processing.  As  has  been  seen,  the  meas¬ 
ures  provide  a  means  to  evaluate  our  models  and  the  quality  of  our  data  as  well  as  our  sur¬ 
face  extraction  algorithms.  We  are  encouraged  by  the  results  obtained  thus  far,  and  feel 
that  the  pursuit  of  our  current  approach  holds  much  promise. 

3.3.3.  A  Study  of  Five  Laser  Radar  Range  Data  Segmenters 

After the targets are detected in an image, the image needs to be segmented to separate the targets from each other and from the background. In this section we examine five different range data segmenters to identify their strong and weak points in segmenting laser radar range data. Although detection might be possible using single scan lines, the segmenters used here assume that the sensor recorded the targets in imaging mode, giving the same vertical resolution as horizontal.

Each of the segmenters was tested in the following way. First, the six range images selected from the A. P. Hill data set (described in Section 3.1.1) were used. The images, shown in Figures 3.1.1-3.1.6, were chosen to include a variety of targets, ranges, fields of view, clutter, and occlusion. Each segmenter was run on each image without any preprocessing (no filtering, etc.) or postprocessing (such as region merging or finding the largest regions). All the segmenters tested required setting a threshold, so instead of trying to pick a threshold (which might influence the performance of the algorithm), we show the results using various thresholds. Figures 3.3.42-3.3.47 show the outputs of each of the segmenters for various thresholds and for each of the given input images. Since the thresholds are based on different features for each of the different segmenters, it is meaningless to try to compare images on the same rows in these figures. Instead, these figures show the sensitivity of each segmenter to threshold selection, and the effects of a misplaced threshold for each segmenter individually. The image in column 1 is the original unprocessed LADAR image. The images in columns 2-4 and 8 are the output of the given segmenter, with white representing background and black representing the target. The images in columns 5-7 have white as the background and the inverted image (black is white, white is black, light gray is dark gray, etc.) as the target. The following sections discuss each of the segmenters.

3.3.3.1. Planar Patch Fitting Error Segmenter

The motivation behind planar patch fitting is that the objects to be segmented are planar (or can be approximated by planes), whereas the background and clutter are not. Planar patch fitting segmentation was presented in [KaYo87]; however, there were a couple of typographical errors in the equations that set the partial derivatives to zero, so the derivation is repeated here. In addition, some simplifications are shown that can be used when the window is symmetric.

Figure 3.3.43 Output of segmenters for target ap1.32411.

Figure 3.3.44 Output of segmenters for target ap1.32504.

Figure 3.3.45 Output of segmenters for target ap1.32633.

Figure 3.3.46 Output of segmenters for target ap1.32837.

Figure 3.3.47 Output of segmenters for target ap1.32839.

3.3.3.1.1. Derivation of Planar Patch Fitting

Planar patches are fitted as follows. First, denote an $M$ by $N$ range image by $z(x,y)$, where $(x,y)$ belongs to a finite region $D = \{(x,y) : 0 \le x \le M-1,\ 0 \le y \le N-1\}$ in the $xy$ plane. Next, partition the plane into overlapping $m$ by $n$ windows by defining a region of vertices $V = \{(u,v) : 0 \le u < M-m+1,\ 0 \le v < N-n+1\}$, as shown in Figure 3.3.48. With each vertex $(u,v)$ in $V$, associate a window of size $m$ by $n$ whose upper left corner starts at $(u,v)$. The exhaustive enumeration of the vertex set covers the entire image and provides a convenient structure for labeling all the windows, as illustrated in Figure 3.3.49.

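Since the planar fit over each window is similar, we will discuss fitting a plane over a single window. A plane is described by the set of all points $\{(x,y,z) : z = ax + by + c\}$. The fitting plane is determined by minimizing the squared-error criterion; that is, determine $(a,b,c)$ to minimize $\varepsilon$, where

$$\varepsilon = \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \bigl( z(x_i,y_j) - a x_i - b y_j - c \bigr)^2$$

Taking partial derivatives of $\varepsilon$ with respect to $a$, $b$, and $c$,

$$\frac{\partial \varepsilon}{\partial a} = -2 \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} x_i \bigl( z(x_i,y_j) - a x_i - b y_j - c \bigr)$$

$$\frac{\partial \varepsilon}{\partial b} = -2 \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} y_j \bigl( z(x_i,y_j) - a x_i - b y_j - c \bigr)$$

$$\frac{\partial \varepsilon}{\partial c} = -2 \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \bigl( z(x_i,y_j) - a x_i - b y_j - c \bigr)$$

and setting the partial derivatives to zero, we obtain

$$a \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} x_i^2 + b \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} x_i y_j + c \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} x_i = \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} x_i\, z(x_i,y_j) \tag{2.1}$$

$$a \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} x_i y_j + b \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} y_j^2 + c \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} y_j = \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} y_j\, z(x_i,y_j) \tag{2.2}$$

$$a \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} x_i + b \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} y_j + c\, mn = \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} z(x_i,y_j) \tag{2.3}$$

a linear system of equations to be solved for $(a,b,c)$ in terms of the range data within the window. This process is repeated for each window in the image.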
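
As a minimal sketch of solving this linear system for a general (not necessarily symmetric) window, the function below accumulates the sums of the least-squares normal equations and solves the 3 by 3 system by Cramer's rule. The function name and the choice of Cramer's rule are assumptions made for illustration, not the report's implementation.

```python
def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c to (x, y, z) samples."""
    sxx = sxy = sx = syy = sy = sxz = syz = sz = 0.0
    count = 0
    for x, y, z in points:
        sxx += x * x
        sxy += x * y
        sx += x
        syy += y * y
        sy += y
        sxz += x * z
        syz += y * z
        sz += z
        count += 1
    # Normal equations: A * (a, b, c)^T = rhs
    A = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(count)]]
    rhs = [sxz, syz, sz]

    def det3(m):
        return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
                - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
                + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

    d = det3(A)
    solution = []
    for k in range(3):
        Ak = [row[:] for row in A]
        for r in range(3):
            Ak[r][k] = rhs[r]  # replace column k with the right-hand side
        solution.append(det3(Ak) / d)
    return tuple(solution)
```

For example, sampling the plane z = 2x - y + 5 on a 3 by 3 grid and passing the samples to `fit_plane` recovers the coefficients (2, -1, 5) up to rounding.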


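3.3.3.1.2. Symmetric Patches

If the window is symmetric (i.e. $x_i = (\ldots,-2,-1,0,1,2,\ldots)$ and $y_j = (\ldots,-2,-1,0,1,2,\ldots)$), the following simplifications can be used:

$$\sum_{i=0}^{m-1} x_i = 0$$

$$\sum_{j=0}^{n-1} y_j = 0$$

$$\sum_{i=0}^{m-1} \sum_{j=0}^{n-1} x_i y_j = 0$$

Applying these to equation 2.1 gives:

$$a\, n \sum_{i=0}^{m-1} x_i^2 = \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} x_i\, z(x_i,y_j)$$

$$a = \frac{\displaystyle \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} x_i\, z(x_i,y_j)}{\displaystyle n \sum_{i=0}^{m-1} x_i^2} \tag{2.4}$$

Applying these to equation 2.2 gives:

$$b\, m \sum_{j=0}^{n-1} y_j^2 = \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} y_j\, z(x_i,y_j)$$

$$b = \frac{\displaystyle \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} y_j\, z(x_i,y_j)}{\displaystyle m \sum_{j=0}^{n-1} y_j^2} \tag{2.5}$$

And finally applying these to equation 2.3 gives:

$$c = \frac{\displaystyle \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} z(x_i,y_j)}{mn} \tag{2.6}$$

Therefore a, b, and c can be solved for directly without any matrix inversion.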
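
The closed-form solution of Equations 2.4-2.6 can be sketched directly. In this illustration the function name is hypothetical, the window dimensions are assumed odd, and `window[i][j]` is assumed to hold z(x_i, y_j) with the coordinates centered on zero.

```python
def fit_plane_symmetric(window):
    """Closed-form planar fit z = a*x + b*y + c over a symmetric window.

    window[i][j] holds z(x_i, y_j); m and n are assumed odd so that the
    coordinates x_i and y_j run over ..., -1, 0, 1, ...
    """
    m, n = len(window), len(window[0])
    xs = [i - m // 2 for i in range(m)]
    ys = [j - n // 2 for j in range(n)]
    sum_xz = sum(xs[i] * window[i][j] for i in range(m) for j in range(n))
    sum_yz = sum(ys[j] * window[i][j] for i in range(m) for j in range(n))
    sum_z = sum(window[i][j] for i in range(m) for j in range(n))
    a = sum_xz / (n * sum(x * x for x in xs))  # Equation 2.4
    b = sum_yz / (m * sum(y * y for y in ys))  # Equation 2.5
    c = sum_z / (m * n)                        # Equation 2.6
    return a, b, c
```

Since the denominators depend only on the window size, they could be precomputed once per image in a real implementation.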


3.3.3.1.3. Fitting Error Segmentation

The targets in column 2 of Figures 3.3.42-3.3.47 were segmented by finding the planar patches for an image and eliminating the center pixel of each patch that has a large fitting error. This method is clearly able to locate all the targets over a wide range of thresholds. This is less a result of the method being a good one than of the data being non-planar except where the targets are located. Table 3.3.8 gives comments on each of the images. Although the images show that this method works well on the range data, it does require a number of computations for each pixel. Assuming a symmetric m by n window, each window (and therefore each pixel, since we are using overlapping windows) requires 2mn multiplications, 3mn additions, and 3 divisions to compute a, b, and c, plus 3mn subtractions and 3mn multiplications to compute the error. Finally, an additional subtraction is needed to compare the error to the threshold.
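
A minimal sketch of fitting error segmentation, assuming odd square windows centered on each pixel, is given below. The function names, the treatment of border pixels (left unclassified as background), and the use of a nested-list image are assumptions for illustration.

```python
def fit_error(window):
    """Squared planar fitting error of an odd-sized window (closed-form fit)."""
    m, n = len(window), len(window[0])
    xs = [i - m // 2 for i in range(m)]
    ys = [j - n // 2 for j in range(n)]
    cells = [(xs[i], ys[j], window[i][j]) for i in range(m) for j in range(n)]
    a = sum(x * z for x, _, z in cells) / (n * sum(x * x for x in xs))
    b = sum(y * z for _, y, z in cells) / (m * sum(y * y for y in ys))
    c = sum(z for _, _, z in cells) / (m * n)
    return sum((z - a * x - b * y - c) ** 2 for x, y, z in cells)


def segment_by_fit_error(image, threshold, size=3):
    """Mark a pixel True (planar, i.e. target) when the window centered on
    it has fitting error <= threshold. Border pixels are left False."""
    half = size // 2
    rows, cols = len(image), len(image[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            window = [row[c - half:c + half + 1]
                      for row in image[r - half:r + half + 1]]
            mask[r][c] = fit_error(window) <= threshold
    return mask
```

On a perfectly planar ramp every interior pixel is kept, while a single spike raises the error of every window that contains it, which is the sensitivity discussed later for the window-based methods.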

Table 3.3.8 Comments on planar patch fitting error segmentation results.

Target       Comments

ap1.32403    At a range of 1 km the curves of the helicopter look like
             planes; therefore planes fit well to the target.

ap1.32411    The square building (clutter) to the left of the M60A2 was
             nicely segmented, since the planes fit to it too. The
             classifier will have to be used to eliminate this clutter from
             being considered a target. The trees in the background will
             fit planes if too large a threshold is used.

ap1.32504    Both of the different apparent sizes of the M60A2 were easily
ap1.32633    found. Dropouts caused many holes in the target if the fit
             error threshold is too small. Some of the ground shows up
             in these images. We expect to see more ground showing
             up in the new data. If this is the case, the orientation of the
             plane may have to be used to help in segmenting planar
             targets from planar ground.

ap1.32837    The 3 by 3 window size causes at least a 3 pixel gap
ap1.32839    between the front target and the occluded targets. This
             could be a problem if there are only a few pixels on target.
             The Rockwell segmenter uses a novel approach to get
             around this problem.

The variance method, presented in the next section, is also able to fit patches to planes, but it uses fewer computations.

* These operation counts do not take into account the additions and multiplications that might be needed to index into a two-dimensional array.

3.3.3.2. Variance Window Method

Suppose we still want to find planar surfaces, but we know they are always parallel to the viewing plane. If this is the case, a and b in Equations 2.4 and 2.5 would always be zero and c in Equation 2.6 would be the distance to the middle of the plane. This distance is simply the average over the window, which we will denote as:

$$\bar{z} = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} z(x_i,y_j)$$

The error equation then becomes:

$$\varepsilon = \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \bigl( z(x_i,y_j) - \bar{z} \bigr)^2$$

which, up to the constant factor $1/mn$, is simply the variance of the pixels in the window.


The variance method is the planar patch fitting error method with the planar patch constrained to be parallel to the viewing plane. Column 3 of Figures 3.3.42-3.3.47 shows the output using this method with a 3 by 3 window, and column 4 shows the output using a 5 by 5 window.

The 3 by 3 window results appear to be as good as those of the planar patch fitting error method with the same sized window. Using a larger window makes the variance method less sensitive to noise; therefore the 5 by 5 window is able to segment out the targets with fewer false segmentations on the background. As expected, the larger window also results in more pixels on the edges of the targets being missed. If the threshold is too low, large (5 by 5) holes are left in the target.
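
The variance window method can be sketched as below. The function names and the handling of border pixels (left as background) are assumptions; the variance is normalized by the window size, so thresholds are in squared range units.

```python
def window_variance(image, r, c, half):
    """Variance of the (2*half+1)-sided square window centered on (r, c)."""
    values = [image[i][j]
              for i in range(r - half, r + half + 1)
              for j in range(c - half, c + half + 1)]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def segment_by_variance(image, threshold, size=3):
    """Mark a pixel True (target) when its window variance is <= threshold.
    Border pixels, where the window does not fit, are left False."""
    half = size // 2
    rows, cols = len(image), len(image[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            mask[r][c] = window_variance(image, r, c, half) <= threshold
    return mask
```

On an image with a range step between two flat surfaces, pixels whose window straddles the step are rejected, which reproduces the gap between overlapping targets noted in Table 3.3.8.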

The advantage of this method is that it requires fewer computations than the planar patch fitting error method. A total of 2mn+1 additions/subtractions, 2 multiplications, 2 divisions, and 1 square root* are required for each pixel.

* The square root is not needed if the threshold is squared.

3.3.3.3. Rockwell Segmenter

When working with window operators, one likes to use as large a window as possible to decrease the sensitivity to noise. We saw this effect in the previous section, where both 3x3 and 5x5 windows were used. The larger window picked up fewer false segmentations on the background. The tradeoff with large windows is that pixels near the edge of the target are lost and very small targets may be completely missed, or large window-sized holes are left in the target when there is a noise spike. The 3 by 3 window in the variance method left a 3 pixel wide gap between the overlapping targets in ap1.32837 and ap1.32839, while the 5 by 5 window left a 5 pixel gap.

The Algorithm Development group of Rockwell International has come up with a novel solution to this problem. They use a large 6 by 6 window, but compute the variance in eight 3 by 6 windows around a given pixel. If any of the windows has a variance below a given threshold, the pixel is classified as the target. Figure 3.3.50 shows the eight windows. The idea is that the pixel might be on an edge whose orientation is unknown. If this edge passes through the pixel, one of the eight windows will lie completely on the target even if the other half of the window is background. If any one of the windows falls on a planar surface, the pixel is classified as planar (i.e. a target).

Figure 3.3.50 The eight windows around a given pixel used in the Rockwell Segmenter.

Column 5 of Figures 3.3.42-3.3.47 shows the output of the Rockwell segmenter. Targets ap1.32837 and ap1.32839 show that it is able to extract overlapping targets without leaving a gap between them.
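
The any-of-eight-windows rule can be illustrated with the sketch below. The geometry is an assumption, since the exact windows of Figure 3.3.50 are not specified in the text: here each subwindow is the half of an odd-sized square window lying on one side of a line through the pixel, at eight orientations 45 degrees apart. The Rockwell implementation uses a 6 by 6 window with 3 by 6 subwindows and may differ in detail. All names are hypothetical.

```python
# Eight assumed half-window orientations, 45 degrees apart (row, col steps).
DIRECTIONS = [(0, 1), (1, 1), (1, 0), (1, -1),
              (0, -1), (-1, -1), (-1, 0), (-1, 1)]


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def min_half_window_variance(image, r, c, half=2):
    """Smallest variance over the eight assumed half-windows around (r, c).
    The caller must keep the full window inside the image."""
    best = None
    for ur, uc in DIRECTIONS:
        values = [image[r + dr][c + dc]
                  for dr in range(-half, half + 1)
                  for dc in range(-half, half + 1)
                  if dr * ur + dc * uc >= 0]  # pixels on one side of the line
        v = variance(values)
        if best is None or v < best:
            best = v
    return best


def is_target(image, r, c, threshold, half=2):
    """Pixel is target if any half-window is (nearly) flat."""
    return min_half_window_variance(image, r, c, half) <= threshold
```

For a pixel lying on a vertical range step, the full window has a large variance, but the half-window on the pixel's own side of the step is flat, so the pixel is still classified as target.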

The price paid for such good segmentation is the number of computations. For an m by n window, the variance must be computed over each of the eight orientations. Each of the subwindows has an area of ½mn. A brute force method requires 8(½·2mn+1) = 8mn+8 additions/subtractions, 1 multiplication, 16 divisions, and 8 square roots. The Rockwell implementation takes note of the overlapping windows and is able to reduce the computations to 4mn+40 additions/subtractions, 1 multiplication, 16 divisions, and 1 square root.

3.3.3.4.  Nettleton  Method 

If  speed  is  important,  the  following  algorithm  by  John  Nettleton  of  the  Center  for 
Night  Vision  and  Electro-Optics  performs  well  with  very  few  computations.  His  approach 
is  to  examine  the  value  of  the  pixel  in  the  center  of  the  window  and  compare  it  to  the 
values  of  all  the  pixels  around  it.  If  enough  pixels  are  close  enough  in  value  to  the  center 
pixel,  the  center  pixel  is  classified  as  target.  Typically  for  a  3  by  3  window,  6  neighbors  in 
the  window  must  be  within  the  threshold.  For  a  5  by  5  window  15  neighbors  must  be  close 
enough. 
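
The center-pixel comparison described above can be sketched as follows. The function name and argument names are hypothetical; the default of 6 close neighbors for a 3 by 3 window follows the text.

```python
def nettleton_pixel(image, r, c, threshold, half=1, min_close=6):
    """Classify (r, c) as target if at least `min_close` neighbors in the
    (2*half+1)-sided window lie within `threshold` of the center pixel."""
    center = image[r][c]
    low, high = center - threshold, center + threshold  # one add, one subtract
    close = 0
    for i in range(r - half, r + half + 1):
        for j in range(c - half, c + half + 1):
            if (i, j) != (r, c) and low <= image[i][j] <= high:
                close += 1
    return close >= min_close
```

For a 5 by 5 window the text's setting corresponds to `half=2, min_close=15`.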

Columns 6 and 7 of Figures 3.3.42-3.3.47 show the results for a 3 by 3 window and a 5 by 5 window. This method does better than any of the other methods in separating the overlapping targets in ap1.32837 and ap1.32839. It leaves just single pixels between the targets.

Unfortunately, since the center pixel is the basis for comparison, the method is very sensitive to dropouts and noise spikes. Notice that even with the largest threshold, ap1.32633 (Figure 3.3.45) still has many holes in it. A simple improvement would be to use the average value of the window instead of the center value.

The method is very fast since it requires only an addition and a subtraction to find the upper and lower ranges and mn subtractions to compute the differences between the center pixel and all the other pixels in the window. If the mean value were used instead of the center value, the cost would be an additional mn additions and a division. Both methods require an additional mn subtractions to compare the pixel differences to a threshold.

3.3.3.5.  Variance-Less-One  Method 

One of the deficiencies of the planar patch fitting methods is that they are sensitive to noise spikes. For example, Figure 3.3.51 shows the six sample images we have been working with, the variance of each image for a 3 by 3 window, and the histograms of the variance. There is a dropout near the front of the leftmost tank in ap1.32837. This dropout caused the variances in the nine windows which contain it to be larger than the variances of the surrounding windows. This shows up on the variance image as a 3 by 3 square which is brighter than the surrounding pixels. This effect can be seen on the other targets as well. In particular, ap1.32633 has a large number of dropouts near the lower left part of the target.

The variance-less-one method overcomes this problem by examining all the pixels in a given window and ignoring the pixel whose value is farthest from the mean of the window when computing the variance for that window. Figure 3.3.52 shows the variance-less-one images for the same images as in Figure 3.3.51. There are fewer "bright squares" caused by the noise spikes. The histograms show that, as expected, the variance is lower when the most extreme point in each window is omitted. In general the histograms appear to be more bimodal, and the segmented images in column 8 of Figures 3.3.42-3.3.47 show that the targets have fewer holes in them and that a lower threshold can be used while still segmenting the entire target.

The  computational  complexity  is  the  same  as  the  variance  method,  except  two  addi¬ 
tional  subtractions  are  needed  to  remove  the  given  pixel  from  the  variance  calculation. 
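
A minimal sketch of the variance-less-one computation for a single window is shown below. The function name is hypothetical, and for clarity the variance is simply recomputed without the outlier rather than corrected incrementally with the two extra subtractions mentioned above.

```python
def variance_less_one(values):
    """Variance of a window after dropping the one value farthest from
    the window mean. `values` is the flattened list of window pixels."""
    mean = sum(values) / len(values)
    outlier = max(range(len(values)), key=lambda k: abs(values[k] - mean))
    kept = values[:outlier] + values[outlier + 1:]
    kept_mean = sum(kept) / len(kept)
    return sum((v - kept_mean) ** 2 for v in kept) / len(kept)
```

A 3 by 3 window on a flat surface with a single dropout therefore yields a variance of zero, whereas the ordinary variance of the same window is large.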

3.3.3.6.  Conclusions 

Five different laser radar range image segmenters were tested and all five performed well on the images they were tested on. The image test set contained a variety of targets, ranges, clutter, and occlusion.

The  Nettleton  segmenter  was  very  sensitive  to  noise  spikes  and  dropouts  because  it 
compares  all  the  pixels  in  a  given  window  to  the  center  pixel.  If  the  center  pixel  is  a  noise 
spike  on  a  target,  for  example,  none  of  the  pixels  around  will  be  close  enough  so  it  will  be 
classified  as  a  background  pixel.  An  advantage  of  this  technique  is  that  the  effects  of  a 
noise  spike  are  confined  to  a  single  pixel. 

The  planar  fit,  variance,  and  Rockwell  methods  are  not  as  sensitive  to  noise  spikes  as 
the  Nettleton  method,  however  the  effect  of  a  spike  will  show  up  in  all  the  windows  which 
contain  the  spike. 

Our variance-less-one method overcomes this problem by finding the pixel which is farthest from the mean of a given window and computing the variance without that pixel. The new method works well with the LADAR images we currently have, since most windows will contain one or no noise spikes. If an image has many spikes in a small area (like ap1.32633), a given window may contain more than one spike. This method will only remove the largest spike, leaving any others to affect the variance calculation. An improvement to this method would be to ignore all spikes that deviate significantly from the mean. This is a common technique used in SAR image processing.




Table 3.3.9 summarizes the computational complexities of the various methods. The Nettleton method is certainly the fastest method, with the variance, variance-less-one, Rockwell, and fitting error methods following in order. Since the dwell time for the sensor is 80 µs [Rayt] and current signal processor chips can perform 16 bit multiplies [TI85] in 200 ns or faster, any of these methods should be able to keep up with the sensor if implemented on such a chip.
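
The timing argument can be checked with a short calculation. This sketch counts only multiplications, using the 5mn figure for the fitting error method; the function names are hypothetical, and additions, divisions, and memory accesses are ignored, so it is a rough bound rather than a full timing model.

```python
DWELL_TIME_S = 80e-6      # sensor dwell time per pixel, from the text
MULTIPLY_TIME_S = 200e-9  # 16-bit multiply time, from the text


def multiplies_per_dwell():
    """Number of multiplies that fit in one dwell time."""
    return round(DWELL_TIME_S / MULTIPLY_TIME_S)


def fit_error_multiplies(m, n):
    """Multiplications per pixel for the fitting error method (5mn)."""
    return 5 * m * n


def fits_in_dwell(m, n):
    return fit_error_multiplies(m, n) <= multiplies_per_dwell()
```

With these figures, 400 multiplies fit in one dwell time, while the most expensive method needs 45 multiplies for a 3 by 3 window and 125 for a 5 by 5 window.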

Table  3.3.9  Summary  of  computational  complexities  of  the  LADAR  segmenters. 


Method 

Additions/ 

Subtractions 

Multipli¬ 

cations 

Divisions 

Square 

Roots 

Fit  Error 

6mn+3 

5mn 

3 

Variance 

2mn+l 

2 

2 

1 

Rockwell 
(brute  force) 

8mn+8 

16 

8 

Rockwell 

(fast) 

4mn+40 

1 

16 

1 

Nettleton 
(center  value) 

2mn+2 

1 

Nettleton 

(average) 

3mn+2 

1 

Variance-less-one 

2mn+3 

2 

2 

1 

3.3.4.  Results  of  Classifying  the  1986  A.P.  Hill  Laser  Range  Data 

The final step in the ATR process, after detection and segmentation, is to classify the targets. For this experiment a set of 26 M60A2 targets and 26 five-ton trucks was selected, as shown in Figures 3.3.53 and 3.3.54. The following sections discuss how the classification was performed and what the results were.

3.3.4.1. Target Segmentation

The 52 targets were segmented using the planar fitting error segmenter presented in Section 3.3.3.1. The fitting error threshold was set by the automatic threshold selector, which simply computed the fitting error histogram and set the threshold at the point where the second derivative was zero. Figures 3.3.55 and 3.3.56 show the targets after segmentation.


3.3.4.2.  Feature  Extraction 

Although many features have been proposed for characterizing FLIR data (over twenty features were used in the experiments in [KaYo87]), much less work has been done for range data [BeJa85]. Since the segmenter was extracting well defined silhouettes, we decided to use the same silhouette-based features that were used with the FLIR data. Tables 3.3.10 and 3.3.11 show the two feature sets that were used.

Figure 3.3.53 M60A2 targets used in Laser Radar classification experiment.

Figure 3.3.54 Five ton truck targets used in Laser Radar classification experiment.

Figure 3.3.55 Segmented M60A2 targets used in the Laser Radar classification experiment.

Table  3.3.10  Shape  features  used  in  Laser  Radar  classification  experiment. 


Feature Name

Area
Height to width ratio
Rectangularity
Width to height ratio
Height squared over area


Table  3.3.11  Moment  features  used  in  Laser  Radar  classification  experiment. 


Note that no range information was used in the feature sets, just the outline data.†
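
The shape features of Table 3.3.10 can be computed from a binary silhouette as sketched below. The function name and dictionary keys are hypothetical. The definition of rectangularity used here (silhouette area divided by bounding-box area) is an assumption, since the text does not define it, and height and width are taken from the axis-aligned bounding box.

```python
def shape_features(mask):
    """Silhouette shape features from a binary mask (truthy = target)."""
    coords = [(r, c) for r, row in enumerate(mask)
              for c, value in enumerate(row) if value]
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    area = len(coords)
    return {
        "area": area,
        "height_to_width": height / width,
        "rectangularity": area / (height * width),  # assumed definition
        "width_to_height": width / height,
        "height_sq_over_area": height * height / area,
    }
```

Only the outline enters these features, which matches the observation that no range values were used.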

3.3.4.3.  Classification  Results 

Classification experiments like those run on the FLIR data in [KaYo87] were run on the segmented data. Table 3.3.12 shows the upper and lower bound estimates on the classification of the LADAR data.

Table  3.3.12  Classification  results  on  Laser  Radar  range  data. 


Feature Set    Lower Bound    Upper Bound

Shape          .9%            5.7%
Moments        .9%            1.9%

† i.e. the moment functions were computed using the value 1 where the target was and the value 0 where the background was.
These results show an improvement over the FLIR data. Remember that the targets ranged from 1 to 3 km, which is the same range as the FLIR data. However, note that there were only two target classes and all the targets were viewed from the same view (side view) and from the same aspect angle (0 degrees).

3.3.4.4.  Conclusions 

With such a limited set of data we feel that no definite conclusions can be drawn from this experiment. The results do show that when more LADAR data arrives we will be ready to conduct experiments with a more realistic selection of targets, ranges, clutter, etc.

3.4.  HIGH  LEVEL  LADAR  PROCESSING 

3.4.1.  Production  Systems  for  Target  Recognition 

This  section  presents  two  different  high  level  solutions  to  the  problem  of  object 
recognition  from  LADAR  images.  Our  long  term  goal  is  to  have  an  autonomous  software 
system  which  will  be  able  to  recognize  a  given  set  of  targets  whenever  these  targets  are 
present  in  LADAR  range  images  in  any  orientation,  combination,  and  number.  The  sys¬ 
tem  should  also  be  able  to  recognize  partially  obscured  targets,  and  to  generate  multiple 
hypotheses  with  confidence  values  where  appropriate. 

During this first series of experiments, the production systems will have to recognize four classes of targets: BMP, BRDM2, M113 (all three of which are armored personnel carriers), and the M60A1 tank. As an initial constraint we will only be considering the single target aspect illustrated in Figure 3.4.1. These targets may appear in any number and any combination in a given range image, and may have missing or noisy surfaces. It is
assumed  that  low  level  processing  has  already  segmented  out  the  surfaces  in  the  scene  and 
computed  their  various  attributes  and  relationships.  Surface  attributes  may  include  loca¬ 
tion,  orientation,  dimensions,  surface  area,  mean  curvature,  and  planar  patch  fitting  error. 
Among  the  relationships  found  between  surfaces  might  be  adjacency,  the  kind  of  edges 
separating  them,  and  whether  they  have  a  convex  or  concave  relationship.  All  of  this 
information  is  output  to  a  file  in  a  format  that  may  be  understood  by  the  expert  systems. 
The  data  provided  as  input  to  the  production  systems  in  this  section  was  hand  extracted 
from  model  information  and  has  had  some  error  added  to  it  in  order  to  illustrate  uncer¬ 
tainty  reasoning  and  recovery  from  low  level  errors.  Of  course,  future  versions  of  our 
software  will  have  to  handle  targets  extracted  from  actual  LADAR  data  at  all  possible 
aspects. 


(d)  M60A1  tank 


Figure  3.4.1  Target  aspect  to  be  recognized  by  high  level  processing. 


3.4.1.1. A Top-Down Goal-Driven Approach

The first target recognition expert system we developed is written in PROLOG. The source for the expert system shell may be found in Appendix E.1. Instead of merely expressing our rules in the usual conclusion-if-condition form that PROLOG uses, the expert system shell is used to provide a more flexible system. A shell allows a user-friendly interface, makes it possible to use any format for expressing rules, keeps track of why the current line of reasoning is being pursued, provides for propagation of belief, and enables the system to show how a conclusion was reached, none of which is done automatically by PROLOG alone.

3.4.1.1.1. Facts and Rules for Solving Goals

Our system design is strongly influenced by the language it is implemented in, reflecting the top-down, goal-driven nature of PROLOG. First, surface attributes and relationships found by the low level processing routines are read in by the expert system and are considered to be facts. Figure 3.4.2 is an example of such low level input to the system. Next, the user specifies a goal like

    target isa m113

and the system uses facts and rules to determine its confidence in the goal. Rules have the format:

    rulename : if
                   condition
               then
                   conclusion
               with
                   strength(N, S).

Condition is a possibly compound new goal to solve to determine the belief in the current goal conclusion. A goal is said to be compound if it is a disjunction or conjunction of subgoals. The strength clause contains factors used by the uncertainty reasoning scheme.

The predicate explore (see Appendix E.1) solves a goal by first checking whether it has been asserted as a fact, then seeing whether there are any directly applicable rules which may be used to solve it. A rule is applicable if an instantiation of its conclusion matches the current goal, and the rule is used by exploring the condition goal and propagating the evidence from this exploration using the Prospector model as described below. If explore cannot find a matching fact or rule and the current goal is compound, the subgoals are explored and their results are combined to determine the confidence in the entire original goal. Finally, if the answer still cannot be determined, the a priori probability of the goal is used.


%% Low-level output for range image containing BMP apc.

% Surface Attributes
%
% attr_location( <surface id>, <x location>, <y location> )
% attr_hwa( <surface id>, <height>, <width>, <area> )
% attr_fit( <surface id>, <fit error> )
%
attr_location(1, 2.3, 1.75).
attr_hwa(1, 0.5, 0.4, 0.2).
attr_fit(1, 250).
attr_location(2, 3.0, 1.75).
attr_hwa(2, 0.5, 0.4, 0.2).
attr_fit(2, 342).
attr_location(3, 3.7, 1.75).
attr_hwa(3, 0.5, 0.4, 0.2).
attr_fit(3, 501).
attr_location(4, 4.6, 1.75).
attr_hwa(4, 0.07, 1.224, 0.08568).
attr_fit(4, 186).
attr_location(5, 1.35, 1.35).
attr_hwa(5, 0.4, 2.9, 0.58).
attr_fit(5, 603).
attr_location(6, 4.0, 1.25).
attr_hwa(6, 0.5, 6.5, 2.67).
attr_fit(6, 23).
attr_location(7, 3.0, 0.5).
attr_hwa(7, 1.0, 6.0, 5.5).
attr_fit(7, 438).

%
% Surface relations
%
% rel_adjacent( <surface id>, <surface id>, <edge type> )
%
rel_adjacent(1, 2, snd).
rel_adjacent(2, 3, snd).
rel_adjacent(3, 4, jmp).
rel_adjacent(5, 6, snd).
rel_adjacent(1, 5, jmp).
rel_adjacent(3, 6, jmp).
rel_adjacent(5, 7, snd).
rel_adjacent(6, 7, snd).

Figure  3.4.2  Example  BMP  surface  attributes  and  relationships  found  by  low  level  processing 
and  input  to  our  first  expert  system  as  facts. 
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3.4.1. 1.2.  The  Prospector  Model  of  Uncertainty  Reasoning 

The Prospector model [Bratko, Duda] is used to propagate the evidence from the next
generation goal condition to the confidence in the parent goal conclusion, and makes use
of the strength factors N and S. N ranges between 0 and 1 and tells how necessary the
condition is for the conclusion; if the condition is false, then the lower N is, the less likely
the conclusion is. S tells how sufficient the condition is for the conclusion and takes values
greater than 1; if the condition is true, then the higher S is, the more likely the conclusion is.
Figure 3.4.3 illustrates how probability is propagated using these strength factors, and how
evidence is combined for compound goals. Note that the posterior probability of the next
generation goal condition, which is found using a fact or another rule, is used to determine the
multiplier M via the graph. This multiplier is then used to change the odds of the parent
goal conclusion in light of the current evidence: odds(H|E) = M * odds0(H), where
odds = prob/(1 - prob). The following discussion describes how
the a priori probabilities are derived and how the system rules work.
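The odds update described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the shell's actual PROLOG code. It assumes that the graph for the multiplier M is piecewise linear, equal to N when p(E) = 0, to 1 when p(E) equals its prior, and to S when p(E) = 1; that shape is the usual Prospector interpolation and is an assumption here. The function names (`to_odds`, `to_prob`, `multiplier`, `update`) are hypothetical.

```python
def to_odds(p):
    """Convert a probability (0 <= p < 1) to odds = p / (1 - p)."""
    return p / (1.0 - p)


def to_prob(o):
    """Convert odds back to a probability."""
    return o / (1.0 + o)


def multiplier(p_e, prior_e, n, s):
    """Assumed piecewise-linear multiplier M for a rule with strengths (N, S).

    M = N at p(E) = 0, M = 1 at p(E) = prior_e, M = S at p(E) = 1.
    prior_e must lie strictly between 0 and 1.
    """
    if p_e <= prior_e:
        return n + (1.0 - n) * (p_e / prior_e)
    return 1.0 + (s - 1.0) * (p_e - prior_e) / (1.0 - prior_e)


def update(prior_h, p_e, prior_e, n, s):
    """Posterior p(H|E) from odds(H|E) = M * odds0(H)."""
    m = multiplier(p_e, prior_e, n, s)
    return to_prob(m * to_odds(prior_h))


def p_not(p):
    """Evidence for 'not E'."""
    return 1.0 - p


def p_and(p1, p2):
    """Evidence for 'E1 and E2'."""
    return min(p1, p2)


if __name__ == "__main__":
    # A hypothesis with prior 0.25 whose condition is known to be true.
    print(update(prior_h=0.25, p_e=1.0, prior_e=0.5, n=0.1, s=100.0))
```

With these assumptions, evidence at its prior value leaves the hypothesis unchanged (M = 1), certain evidence multiplies the prior odds by S, and evidence known to be false multiplies them by N.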

3.4.1.1.3. Recognition via Decomposition and Evidence Accumulation

Figure 3.4.4 illustrates the names given to the surfaces composing each target. The
system  tries  to  find  evidence  for  a  target  by  looking  for  its  constituent  parts.  At  the  first 
level  these  are  turrets,  decks,  and  tracks.  These  are  decomposed  further  until  the  surface 
level  is  reached.  Figure  3.4.5  shows  the  breakdown  of  our  four  targets.  Let  us  call  a  node 
at  any  level  of  the  tree,  including  the  leaves,  a  construct.  The  rules  reflect  the  structure  of 
the  objects,  and  the  system  will  generate  object  goals  and  break  them  down  the  same  way 
the  object  is  decomposed  into  component  constructs.  The  system  uses  the  rules  to  discover 
the  object  structure  as  it  moves  down  the  decomposition  tree.  On  the  way  down  the  tree, 
the  surface  identities  in  this  structure  remain  uninstantiated.  Once  the  leaves  of  the  tree 
are  reached,  the  system  uses  the  facts  asserted  by  the  low  level  processing  to  find  surfaces 
meeting  stated  attribute  requirements.  A  surface  is  acceptable  if  its  attribute  is  within  2 
percent  of  the  required  value.  The  system  now  "unwinds",  moving  back  up  the  tree  with 
these  instantiations  for  surfaces.  Consistency  and  adjacency  checking  is  performed  by  the 
rules  during  this  "unwinding"  process,  and  the  Prospector  model  is  used  to  combine  the 
acquired  evidence. 

The  a  priori  probabilities  are  assigned  using  the  decomposition  tree.  Since  there  are 
four  possible  targets  at  the  top  level,  each  is  assigned  an  equal  a  priori  probability  of  0.25. 
The  remaining  construct  probabilities  are  determined  by  breaking  down  the  top  level  pro¬ 
babilities.  Each  sibling  construct  gets  an  equal  share  of  its  parent's  probability,  as  also 
illustrated  in  Figure  3.4.5.  These  a  priori  probabilities  are  asserted  into  the  database  along 
with  the  rules.  See  Appendix  E.2  for  a  complete  list  of  the  rules  and  probabilities  for  our 
targets. 
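The equal-share assignment can be illustrated with a short Python sketch. The BMP decomposition used below is inferred from the a priori values listed in Figure 3.4.11 (the tree of Figure 3.4.5 itself is not reproduced here), and the function name `assign_priors` and the tuple representation of the tree are hypothetical.

```python
def assign_priors(tree, prior, out=None):
    """Give each node its prior; siblings split the parent's prior equally.

    tree is a (name, children) pair, where children is a list of such pairs.
    Returns a dict mapping construct names to a priori probabilities.
    """
    if out is None:
        out = {}
    name, children = tree
    out[name] = prior
    for child in children:
        assign_priors(child, prior / len(children), out)
    return out


# Assumed BMP decomposition, inferred from the values in Figure 3.4.11.
BMP = ("bmp", [
    ("medium_track", []),
    ("medium_deck", [("med_rear_deck", []), ("med_fore_deck", [])]),
    ("medium_turret", [
        ("small_gunbarrel", []),
        ("medium_dome", [("medium_dome_panel", [])] * 3),
    ]),
])

if __name__ == "__main__":
    # Four target classes, so the top level prior is 1/4.
    for name, p in assign_priors(BMP, 0.25).items():
        print(f"{name:20s} {p:.5f}")
```

Under this assumed tree, the top-level prior of 0.25 yields 0.08333 for each of the three first-level constructs and 0.01389 for each dome panel, matching the values asserted by the initialization rule.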


[Figure 3.4.3: diagram of a rule with strength factors (N, S) linking the evidence E (condition) to the hypothesis H (conclusion), with a graph of the multiplier M against p(E).]

    E  = Evidence (Condition)        H = Hypothesis (Conclusion)
    p0 = prior probability           p = posterior probability

    odds = prob / (1 - prob)
    odds(H|E) = M * odds0(H)

    Logical relations and combining evidence for compound goals:
        not E      :  1 - p
        E1 and E2  :  min(p1, p2)
        E1 or E2   :  max(p1, p2)

Figure 3.4.3 How evidence is propagated using the Prospector model.


BMP apc
    1 : medium_dome_panel
    2 : medium_dome_panel
    3 : medium_dome_panel
    4 : small_gunbarrel
    5 : med_rear_deck
    6 : med_fore_deck
    7 : medium_track

BRDM2 apc
    1 : small_dome_panel
    2 : small_dome_panel
    3 : small_dome_panel
    4 : lg_rear_deck
    5 : lg_mid_deck
    6 : lg_fore_deck
    7 : top_front
    8 : bottom_front
    9 : med_side
    10 : wheel
    11 : wheel

M113 apc
    1 : large_side
    2 : small_track

M60A1 tank
    1 : hatch
    2 : turret_body
    3 : turret_front
    4 : large_gunbarrel
    5 : small_deck
    6 : large_track

Figure 3.4.4 Identities of surfaces composing the targets.

Figure 3.4.5 Tree showing object decompositions and a priori probabilities.
3.4.1.1.4. Default Reasoning

If the system cannot find a surface with the required attributes, we do not want
PROLOG to fail. Instead, we want to build up a lower confidence in our hypothesized target
class. For this reason, attribute value facts are assigned an a priori probability of 0.00001.
This allows the system to continue the reasoning process even though a required surface is
not present or was corrupted due to noise or some low level processing error. Later, we
can tell that a given surface was not found because it remained uninstantiated.

The  relative  importance  of  an  object  component  or  surface  is  captured  in  the  strength 
factors  of  the  rule  responsible  for  finding  it.  If  a  construct  is  relatively  unimportant  or 
often  hard  to  find  (e.g.  a  gun  barrel),  N  should  be  set  close  to  1  so  that  the  confidence  in  the 
entire  object  will  not  be  severely  penalized  if  it  is  not  present.  If  a  construct  is  important 
(e.g.  a  turret),  a  high  S  value  will  ensure  that  if  it  is  found  the  confidence  in  the  target  class 
will  increase  greatly,  and  a  low  N  value  will  greatly  decrease  the  confidence  if  it  is  not 
found. 

3.4.1.1.5. An Example and Some Conclusions

Figure 3.4.6 is an example session with the expert system. Once inside PROLOG,
the first step is to read in the expert system shell, its utility routines, and the file containing
the rules and a priori probabilities. Next, the file containing the surface attributes and
relationships found by the low level processing is loaded. In this case, the input file is
simulated information for a range image containing an M113 apc. The expert system shell is
asked whether an M113 is present, and it responds with a confidence value (probability) of
0.998499. It is also able to show how it arrived at this conclusion. When asked if an
M60A1 tank is present, the program responds with a confidence value of 0.004831.

We discovered some serious problems with the high level approach, which made
implementing it very educational. The primary source of trouble is that it is top-down.
Since the user obviously doesn’t know what is in the range image (if he did, he probably
wouldn’t be wasting his time talking to our software), he is not in a position to ask the best
questions and probably shouldn’t even be consulted. On the other hand, the computer
doesn’t know what is in the range image either, and so must try all possibilities. This gives
us an unreasonably large search space and, unfortunately, there is no way to prune it using
our approach. Once we have found a tank we cannot exclude the possibility of there being
an apc in the image as well, and we don’t even know if we have found all of the tanks that
are there. The system also is not smart enough to find the solution with the highest
confidence value, but rather returns the first solution it finds. Finding the most probable
solution would require searching the entire space, which would take an unacceptable
amount of time.

There  are  also  complications  arising  from  the  fact  that  we  allow  a  surface  to  remain 
uninstantiated  in  order  to  handle  low  level  processing  errors,  that  we  never  really  force 


% prolog
C-Prolog version 1.5
| ?- [expert, utilities, target_rules].
expert consulted $144 bytes 1.5 sec.
utilities consulted 1100 bytes 0.400001 sec.
target_rules consulted 12588 bytes 3.46667 sec.

yes
| ?- ['m113.surfaces'].
m113.surfaces consulted 280 bytes 0.133334 sec.
yes
| ?- expert.
Question, please:
|: target isa m113.

target isa m113 : 0.998499
Would you like to see how? yes.

target isa m113 : 0.998499 was derived by rule3 from
large_side(1) and small_track(2) and m113_adjacencies(1,2) : 0.998602 was derived from
large_side(1) : 0.998602 was derived by rule16 from
elevation(1,1.35) and height(1,1.3) and width(1,5.3) and area(1,6.2) and planar(1) : 1 was derived from
elevation(1,1.35) : 1 was found as a fact
and
height(1,1.3) and width(1,5.3) and area(1,6.2) and planar(1) : 1 was derived from
height(1,1.3) : 1 was found as a fact
and
width(1,5.3) and area(1,6.2) and planar(1) : 1 was derived from
width(1,5.3) : 1 was found as a fact
and
area(1,6.2) and planar(1) : 1 was derived from
area(1,6.2) : 1 was found as a fact
and
planar(1) : 1 was found as a fact
and
small_track(2) and m113_adjacencies(1,2) : 0.9993 was derived from
small_track(2) : 0.9993 was derived by rule11 from
elevation(2,0.35) and height(2,0.7) and width(2,5.3) and area(2,3.08) and planar(2) : 1 was derived from
elevation(2,0.35) : 1 was found as a fact
and
height(2,0.7) and width(2,5.3) and area(2,3.08) and planar(2) : 1 was derived from
height(2,0.7) : 1 was found as a fact
and
width(2,5.3) and area(2,3.08) and planar(2) : 1 was derived from
width(2,5.3) : 1 was found as a fact
and
area(2,3.08) and planar(2) : 1 was derived from
area(2,3.08) : 1 was found as a fact
and
planar(2) : 1 was found as a fact
and
m113_adjacencies(1,2) : 0.9993 was derived by rule19 from
adjacent(1,2,jmp) : 1 was found as a fact

More solutions? no.
yes
| ?- expert.
Question, please:
|: target isa m60a1.

target isa m60a1 : 0.004831
Would you like to see how? no.

More solutions? no.
yes
| ?-
[ Prolog execution halted ]

Figure 3.4.6 Example session with first expert system.
things to fail since we are using uncertainty reasoning, and that PROLOG backtracks to the
most recent choice it made in order to explore other possibilities. When we look for other
solutions via backtracking, the system will return all of the possible permutations of the
current set of surfaces with subsets of them uninstantiated before proceeding to a new set
of surfaces making up the next target. In the same way, it is also difficult to make the
system recover from errors in early choices. The system tends to propagate a low confidence
value and continue working if an adjacency constraint is not met, instead of backtracking
and making a proper choice for a surface.

Besides gross inefficiencies arising from its top-down approach, the system also
suffers from parameters that are difficult to set. Strength factors are determined
heuristically through experience in using the system. The effect of the factors is not local;
tweaking the factors in one rule affects the interaction with sibling rules, and side effects
propagate up and across the tree. This makes the system hard to adjust, and slight structural
reconfigurations of the system require an extensive amount of work. Keeping all of these
problems in mind, we had far more success developing the expert system described in the
next section.

3.4.1.2. A Data-Driven Bottom-Up Approach

The  second  target  recognition  expert  system  we  developed  is  written  in  OPS5 
[BroFar].  Like  the  first  one,  it  reflects  the  character  of  the  programming  language  it  is 
implemented  in.  OPS5  is  a  data-driven  language,  and  is  amenable  to  the  bottom-up 
approach  we  wish  to  experiment  with.  We  found  the  natural  structure  of  OPS5  rules  and 
its  flow  of  control  to  be  sufficient  for  our  purposes,  and  so  we  did  not  build  a  shell  on  top 
of  it.  At  this  point  we  wish  to  note  that  the  OPS5  production  system  programming 
language  is  implemented  in  LISP  and  runs  in  the  LISP  environment. 

3.4.1.2.1. Data Objects, Rules, and the Inference Engine

Before we delve into the details of the expert system, a few words about the OPS
production system architecture are in order. A data store, called working memory, serves as a
global database of symbols representing facts and assertions about the problem. The data are
instances of objects, which may represent either physical objects (or facts) related to the
domain of application or conceptual objects (such as goals) related to the problem solving strategy.
An instantiated data object is called a working memory element. Figure 3.4.7 contains the
declarations of the data object classes used by the expert system. The capitalized word is
the name of the object class, and is followed by attribute names associated with the class.
The classes Start and Phase represent conceptual objects used in flow of control, while
World, Surface, Adjacent, and Construct represent physical objects or facts.

A  set  of  rules  constitutes  an  OPS  program,  and  resides  in  the  production  memory. 
Rule  definitions  have  the  format: 


(literalize Start)          ; Element class for initialization.

(literalize Phase
    description             ; expand, hypothesize, build, or clean
                            ;   expand      : fill in missing attributes & relations
                            ;   hypothesize : propose surface identities
                            ;   build       : construct higher level objects
                            ;   clean       : perform garbage collection
    status                  ; active, finished
)

(literalize World
    classes                 ; number of object classes
    threshold               ; confidence threshold
)

(literalize Surface
    id                      ; unique number > 0 identifying surface
    x_loc                   ; x-coordinate of surface location
    y_loc                   ; y-coordinate of surface location
    height                  ; surface "height" (x-coordinate span)
    width                   ; surface "width" (y-coordinate span)
    depth                   ; surface "depth" (z-coordinate span)
    area                    ; surface area
    h_to_w                  ; height to width ratio
    fit_error               ; mean planar patch fitting error
)

(literalize Adjacent
    edge_type               ; type of boundary between two surfaces
                            ;   jmp : jump edge due to range discontinuity
                            ;   crv : curvature edge
                            ;   snd : surface normal disparity edge
    first                   ; first surface
    second                  ; second surface
)

(vector-attribute surfaces)

(literalize Construct
    type                    ; the name of the construct (e.g. bmp, small_turret,
                            ; gun_barrel) or flag value "output"
    confidence              ; amount of belief in existence of construct
    surfaces                ; ordered list of surfaces in construct
)


Figure 3.4.7 Declarations of working memory element classes.




(p rulename
    ( condition element 1 )
    ...
    ( condition element n )
-->
    ( action 1 )
    ...
    ( action m ) )

The  condition  part  of  the  rule  is  a  list  of  element  templates  which  are  matched  against  the 
contents  of  working  memory.  The  action  part  of  a  rule  is  a  list  of  instructions  which  may 
modify  the  working  and  production  memories.  The  complete  set  of  rules  for  our  system  is 
listed  in  Appendix  F. 

The third component of the OPS production system architecture is the inference
engine. It must determine which rules are relevant to a given working memory
configuration and choose one to execute. This selection or control strategy is sometimes
called conflict resolution. Figure 3.4.8 illustrates the production system architecture used
by OPS. The inference engine iterates through a cycle of three action states. In the first
state, MATCH, the machine finds all of the rules that are satisfied by the current contents of
the working memory. The rule matchings that are found are all candidates for execution,
and are known collectively as the conflict set. The same rule may appear in the conflict set
several times if it is satisfied by different sets of working memory elements. The SELECT
state applies some predetermined selection strategy to determine which rules in the conflict
set will actually be executed. These rules are then fired in the EXECUTE state, which
usually produces some change in production and/or working memory. Control then cycles
back to the MATCH state. The program terminates normally when the conflict set is
empty.
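The MATCH, SELECT, EXECUTE cycle can be illustrated with a toy interpreter in Python. This is a minimal sketch under simplifying assumptions, not OPS5: each rule has a single condition element, working memory is a list of dictionaries, and SELECT simply takes the first entry of the conflict set, whereas OPS5 applies its own conflict resolution strategy. The rule names are borrowed from the flow of control rules of Figure 3.4.9; the helper names (`phase`, `set_fields`, `remove`, `run`) are hypothetical.

```python
def phase(description=None, status=None):
    """Build a condition that matches a Phase element (None = don't care)."""
    def cond(elem):
        return (elem.get("class") == "Phase"
                and (description is None or elem["description"] == description)
                and elem["status"] == status)
    return cond


def set_fields(**changes):
    """Build an action that modifies the matched element in place."""
    def action(wm, elem):
        elem.update(changes)
    return action


def remove(wm, elem):
    """Action that removes the matched element from working memory."""
    wm.remove(elem)


RULES = [
    ("Phase-Finished", phase(status="active"), set_fields(status="finished")),
    ("Expand-to-Hypothesize", phase("expand", "finished"),
     set_fields(description="hypothesize", status="active")),
    ("Hypothesize-to-Build", phase("hypothesize", "finished"),
     set_fields(description="build", status="active")),
    ("Build-to-Clean_Up", phase("build", "finished"),
     set_fields(description="clean", status="active")),
    ("Output_Results", phase("clean", "finished"), remove),
]


def run(rules, wm, max_cycles=100):
    """Run the recognize-act cycle; return the names of the rules fired."""
    trace = []
    for _ in range(max_cycles):
        # MATCH: collect every (rule, element) pair whose condition holds.
        conflict_set = [(name, action, elem)
                        for name, cond, action in rules
                        for elem in list(wm) if cond(elem)]
        if not conflict_set:
            break                      # normal termination
        # SELECT: simplistic strategy, take the first candidate.
        name, action, elem = conflict_set[0]
        # EXECUTE: fire the rule, changing working memory.
        action(wm, elem)
        trace.append(name)
    return trace


if __name__ == "__main__":
    memory = [{"class": "Phase", "description": "expand", "status": "active"}]
    print(run(RULES, memory))
```

With only the control rules loaded, each phase is marked finished immediately, so the trace shows the bare phase sequence. In the real system the phase-specific rules would presumably be selected ahead of Phase-Finished while they still have work to do.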

In this system, control is based on frequent re-evaluation of the data states, not on any
static control structure of the program. It therefore uses a data-driven philosophy. This
works well with the bottom-up approach. The program has been designed to operate in
phases. Each phase has its own particular goals and uses a disjoint subset of the system
rules to accomplish them. Figure 3.4.9 contains the flow of control production rules which
perform the phase transitions. The currently active phase is tracked using the Phase
working memory element. We will now describe the individual phases in detail.

3.4.1.2.2. Phase 1: Fill in Missing Attributes and Relations

The purpose of Phase 1 is to overcome several deficiencies of OPS, including the
inability of OPS to compute items on the left hand (condition) side of rules and the lack of an
easy way to deal with reflexive relations. During this "expand" phase surface attributes


[Figure 3.4.8: block diagram with boxes DATA (working memory) and RULES (production memory); the legend distinguishes flow of data from flow of control.]

Figure 3.4.8 OPS5 production system architecture.


;; Flow of Control Production Rules

(p Phase-Finished                                        ; 001
    { (Phase ^status active) <phase> }
-->
    (modify <phase> ^status finished)
)                                                        ; 001

(p Expand-to-Hypothesize                                 ; 002
    { (Phase ^description expand ^status finished) <phase> }
-->
    (modify <phase> ^description hypothesize ^status active)
)                                                        ; 002

(p Hypothesize-to-Build                                  ; 003
    { (Phase ^description hypothesize ^status finished) <phase> }
-->
    (modify <phase> ^description build ^status active)
)                                                        ; 003

(p Build-to-Clean_Up                                     ; 004
    { (Phase ^description build ^status finished) <phase> }
-->
    (modify <phase> ^description clean ^status active)
)                                                        ; 004

(p Output_Results                                        ; 005
    { (Phase ^description clean ^status finished) <phase> }
-->
    (remove <phase>)
)                                                        ; 005


Figure 3.4.9 Flow of control production rules performing phase transitions.
and relationships not explicitly represented are calculated and filled in. Height-to-width
ratio, an attribute which could be easily computed from the height and width attributes and
checked on the fly if another programming language were used, here must be explicitly
stored if it is to be checked in the precondition part of a rule. In order to handle the
relation Adjacent without resorting to an excessive number of rules, it was decided that
adjacent(B,A) would be asserted into working memory for every occurrence of
adjacent(A,B).

While building up higher level constructs from surfaces, as will be described later,
this system uses a default surface, with id number 0, in place of any surface which may not
have been found by the low level processing. Since it is not known at this stage of the
processing which surface or surfaces the default surface will have to substitute for, Phase 1
also asserts the adjacency of surface 0 with every known surface. Once again we have
tried to keep the rules simple and few in number at the expense of working memory use.
This should allow the production system to remain lucid while taking full advantage of the
efficient Rete pattern matcher used by OPS. Figure 3.4.10 contains example Phase 1 rules.
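The effect of the "expand" phase on working memory can be sketched procedurally in Python. This is an illustrative sketch only: the actual system performs these steps with OPS5 production rules, and the function name `expand` and the tuple representation `(edge_type, first, second)` of an Adjacent element are assumptions made for the example. The two sample surfaces reuse attribute values from Figure 3.4.2.

```python
EDGE_TYPES = ("jmp", "snd", "crv")
DEFAULT_SURFACE = 0


def expand(surfaces, adjacent):
    """Fill in derived attributes and relations.

    surfaces: list of dicts with at least "id", "height" and "width".
    adjacent: set of (edge_type, first, second) tuples.
    Returns the expanded adjacency set; surfaces are updated in place.
    """
    # Store the height-to-width ratio explicitly so rules can test it.
    for s in surfaces:
        if s.get("h_to_w") is None:
            s["h_to_w"] = s["height"] / s["width"]

    expanded = set(adjacent)
    # Assert adjacent(B, A) for every adjacent(A, B).
    for edge_type, a, b in adjacent:
        expanded.add((edge_type, b, a))
    # The default surface is adjacent to itself and to every known
    # surface, with every edge type.
    for edge_type in EDGE_TYPES:
        expanded.add((edge_type, DEFAULT_SURFACE, DEFAULT_SURFACE))
        for s in surfaces:
            expanded.add((edge_type, s["id"], DEFAULT_SURFACE))
            expanded.add((edge_type, DEFAULT_SURFACE, s["id"]))
    return expanded


if __name__ == "__main__":
    surfaces = [{"id": 1, "height": 0.5, "width": 0.4},
                {"id": 2, "height": 0.5, "width": 0.4}]
    relations = expand(surfaces, {("snd", 1, 2)})
    print(surfaces[0]["h_to_w"], len(relations))
```

Even for two surfaces and one measured adjacency the relation set grows to seventeen elements, which illustrates the trade of working memory for rule simplicity mentioned above.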

3.4.1.2.3. Phase 2: Hypothesize Surface Identities

In Phase 2 the expert system attempts to make hypotheses about the identities of
surfaces based on their attributes. The surface names used are the same as those used by our
first expert system, and are given in Figure 3.4.4. If the attributes of a surface are within
the ranges specified by the preconditions of a hypothesis-generating rule, that surface is
asserted as a Construct consisting of the single surface. Suppose there are n classes in the
particular domain in which we are working and that the a priori probability of the
single-surface construct which we are hypothesizing is c; then the confidence in our construct
using the given surface is n*c. The reason for this will be given in the description of the
next phase. The a priori probability for a given construct was asserted by the initialization
rule (Initialize, rule number 999, see Figure 3.4.11), and it can be easily recognized
because it consists of the default surface (id number 0).
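A hypothesis-generating rule can be sketched in Python as follows. The nominal turret_body attribute values, the 2 percent tolerance, the fit error limit of 1000, and the a priori probability of 0.02083 are taken from Figures 3.4.11 and 3.4.12; the function names (`within`, `hypothesize`), the dictionary representation, and the sample surface measurements are hypothetical.

```python
TOLERANCE = 0.02          # attributes must lie within 2 percent of nominal
MAX_FIT_ERROR = 1000      # planarity test on the mean patch fitting error
N_CLASSES = 4             # number of target classes in the domain

TURRET_BODY = {"y_loc": 2.3, "height": 1.4, "width": 2.0, "area": 2.8}
TURRET_BODY_PRIOR = 0.02083


def within(value, nominal, tolerance=TOLERANCE):
    """True if value lies strictly inside nominal +- tolerance."""
    return nominal * (1 - tolerance) < value < nominal * (1 + tolerance)


def hypothesize(surface, construct_type, template, prior, n_classes=N_CLASSES):
    """Return a single-surface construct with confidence n*c, or None."""
    if surface["fit_error"] >= MAX_FIT_ERROR:
        return None
    if not all(within(surface[attr], nominal)
               for attr, nominal in template.items()):
        return None
    return {"type": construct_type,
            "confidence": n_classes * prior,
            "surfaces": [surface["id"]]}


if __name__ == "__main__":
    # Hypothetical measured surface, for illustration only.
    surface = {"id": 2, "y_loc": 2.31, "height": 1.39, "width": 2.0,
               "area": 2.79, "fit_error": 120}
    print(hypothesize(surface, "turret_body", TURRET_BODY, TURRET_BODY_PRIOR))
```

A surface that passes every range test is promoted from the a priori confidence c to n*c; a surface that fails any test produces no hypothesis at all, leaving the default construct in place.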

A "clean up" rule that is active at the end of this phase deserves special mention.
This rule (H-Clean-Up, rule number 222) removes the default constructs for which
hypothetically valid surfaces have been found. This is to avoid the use of the default
surface when valid surfaces exist, and thereby prevent the building up of a large number of
false hypothetical constructs, each representing a different permutation of where the
default surface could have been used. The only problem with this temporary fix is that if
two instances of a target class occur in one image, one complete and the other with a
missing surface, the default is no longer around to contribute to the construction of the
incomplete target. The robust and necessary solution to this nontrivial problem is to allow the
system to build up the constructs using the default surface, and then have rules that
recognize and discard these "more general" instantiations of the construct when "more specific"

;; Phase 1 Production Rules : Fill in missing surface attributes & relations

(p Expand-H_to_W
    (Phase ^description expand ^status active)
    { (Surface ^h_to_w nil ^height <h> ^width <w>) <s> }
-->
    (modify <s> ^h_to_w (compute <h> // <w>))
)

(p Expand-Adjacent_1                                     ; 102
    (Phase ^description expand ^status active)
  - (Adjacent ^first 0 ^second 0)
-->
    (make Adjacent ^edge_type jmp ^first 0 ^second 0)
    (make Adjacent ^edge_type snd ^first 0 ^second 0)
    (make Adjacent ^edge_type crv ^first 0 ^second 0)
)                                                        ; 102

(p Expand-Adjacent_2
    (Phase ^description expand ^status active)
    (Surface ^id { <s> <> nil <> 0 })
  - (Adjacent ^first <s> ^second 0)
-->
    (make Adjacent ^edge_type jmp ^first <s> ^second 0)
    (make Adjacent ^edge_type snd ^first <s> ^second 0)
    (make Adjacent ^edge_type crv ^first <s> ^second 0)
)

(p Expand-Adjacent_3                                     ; 104
    (Phase ^description expand ^status active)
    (Adjacent ^edge_type <type> ^first <s1> ^second <s2>)
  - (Adjacent ^edge_type <type> ^first <s2> ^second <s1>)
-->
    (make Adjacent ^edge_type <type> ^first <s2> ^second <s1>)
)                                                        ; 104


Figure 3.4.10 Example Phase 1 production rules.


;; Rule to Initialize Working Memory
;; When an element of class Start enters working memory,
;; this rule initializes the database of a priori probabilities.

(p Initialize                                            ; 999
    { <initialize> (Start) }
-->
    (remove <initialize>)
    (make Phase ^description expand ^status active)
    (make World ^classes 4 ^threshold 0.5)
    (make Construct ^type bmp               ^confidence 0.25    ^surfaces 0)
    (make Construct ^type medium_track      ^confidence 0.08333 ^surfaces 0)
    (make Construct ^type medium_deck       ^confidence 0.08333 ^surfaces 0)
    (make Construct ^type med_rear_deck     ^confidence 0.04166 ^surfaces 0)
    (make Construct ^type med_fore_deck     ^confidence 0.04166 ^surfaces 0)
    (make Construct ^type medium_turret     ^confidence 0.08333 ^surfaces 0)
    (make Construct ^type small_gunbarrel   ^confidence 0.04166 ^surfaces 0)
    (make Construct ^type medium_dome       ^confidence 0.04166 ^surfaces 0)
    (make Construct ^type medium_dome_panel ^confidence 0.01389 ^surfaces 0)
    (make Construct ^type brdm2             ^confidence 0.25    ^surfaces 0)
    (make Construct ^type small_turret      ^confidence 0.0625  ^surfaces 0)
    (make Construct ^type small_dome        ^confidence 0.0625  ^surfaces 0)
    (make Construct ^type small_dome_panel  ^confidence 0.02083 ^surfaces 0)
    (make Construct ^type large_deck        ^confidence 0.0625  ^surfaces 0)
    (make Construct ^type lg_rear_deck      ^confidence 0.02083 ^surfaces 0)
    (make Construct ^type lg_mid_deck       ^confidence 0.02083 ^surfaces 0)
    (make Construct ^type lg_fore_deck      ^confidence 0.02083 ^surfaces 0)
    (make Construct ^type car_side          ^confidence 0.0625  ^surfaces 0)
    (make Construct ^type med_side          ^confidence 0.02083 ^surfaces 0)
    (make Construct ^type wheel             ^confidence 0.02083 ^surfaces 0)
    (make Construct ^type car_front         ^confidence 0.0625  ^surfaces 0)
    (make Construct ^type top_front         ^confidence 0.03125 ^surfaces 0)
    (make Construct ^type bottom_front      ^confidence 0.03125 ^surfaces 0)
    (make Construct ^type m113              ^confidence 0.25    ^surfaces 0)
    (make Construct ^type large_side        ^confidence 0.125   ^surfaces 0)
    (make Construct ^type small_track       ^confidence 0.125   ^surfaces 0)
    (make Construct ^type m60a1             ^confidence 0.25    ^surfaces 0)
    (make Construct ^type large_turret      ^confidence 0.08333 ^surfaces 0)
    (make Construct ^type hatch             ^confidence 0.02083 ^surfaces 0)
    (make Construct ^type turret_body       ^confidence 0.02083 ^surfaces 0)
    (make Construct ^type turret_front      ^confidence 0.02083 ^surfaces 0)
    (make Construct ^type large_gunbarrel   ^confidence 0.02083 ^surfaces 0)
    (make Construct ^type small_deck        ^confidence 0.08333 ^surfaces 0)
    (make Construct ^type large_track       ^confidence 0.08333 ^surfaces 0)
)

;; Make initialization class element
(make Start)


Figure 3.4.11 Production rule that initializes working memory by asserting the a priori
probabilities of surfaces and higher level constructs.
ones exist. Example Phase 2 production rules appear in Figure 3.4.12.
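The proposed alternative, discarding "more general" instantiations when "more specific" ones exist, was not implemented in the system described here. One way it might look is sketched below in Python. The subsumption test, the function names (`is_more_general`, `discard_general`), and the construct representation are all assumptions made for illustration: a construct is treated as more general than another of the same type if it agrees with it at every position except where it uses the default surface 0.

```python
DEFAULT_SURFACE = 0


def is_more_general(a, b):
    """True if construct a is b with some surfaces replaced by the default."""
    if a["type"] != b["type"] or len(a["surfaces"]) != len(b["surfaces"]):
        return False
    pairs = list(zip(a["surfaces"], b["surfaces"]))
    agrees = all(x == y or x == DEFAULT_SURFACE for x, y in pairs)
    strictly = any(x == DEFAULT_SURFACE and y != DEFAULT_SURFACE
                   for x, y in pairs)
    return agrees and strictly


def discard_general(constructs):
    """Keep only constructs not subsumed by a more specific one."""
    return [c for c in constructs
            if not any(is_more_general(c, other) for other in constructs)]


if __name__ == "__main__":
    constructs = [
        {"type": "m113", "surfaces": [1, 0]},   # subsumed by [1, 2]
        {"type": "m113", "surfaces": [1, 2]},   # complete target
        {"type": "m113", "surfaces": [3, 0]},   # second, incomplete target
    ]
    for c in discard_general(constructs):
        print(c)
```

In this sketch the incomplete second target (surfaces 3 and the default) survives, because no more specific construct contains surface 3, while the redundant default-based copy of the complete target is dropped.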

3.4.1.2.4. Phase 3: Build Higher Level Constructs

This is a very important phase, the one in which we will address the assignment of a
priori probabilities and uncertainty propagation. Phase 3 uses the surface constructs
hypothesized in Phase 2 to build higher level constructs such as turrets and decks and,
eventually, complete targets. Adjacencies are checked as these constructs are put together,
and the default surface may be used in place of one that the low level processing may have
missed. Figure 3.4.13 contains example Phase 3 production rules.

Since there are four possible targets, the a priori probability of each object class is
0.25, and would in general be the reciprocal of the number of classes (1/n). Figure 3.4.5 is
a diagram showing how each of the four targets is broken down into smaller constructs.
The fraction following each construct node in the tree is the a priori probability of that
construct, and is equal to the sum of the a priori probabilities of its immediate children.
This is also how evidence is propagated up the tree. The confidence value for a higher
level construct is the sum of the confidence values for its constituent parts. In this
particular probability assignment, all constituent constructs (those which are not complete targets
by themselves) were given "equal" weight in that all sibling constructs contribute an equal
amount of confidence to their parent construct. The weights of these siblings could be
shifted to emphasize the relative importance or unimportance of a given part, so long as
the sum of their confidence values remained the same and the effect is propagated down
the tree (each sibling construct’s confidence must still equal the sum of its immediate
children’s confidences). Figure 3.4.14 is an example of such a weighted tree.

At the lowest level of the tree, surfaces may provide positive evidence for the
existence of a construct, but the lack of a surface cannot disprove the existence of a
construct, since the surface may have been missed because of noise or some other error in low
level processing. If a surface is missing, the default surface is automatically used in its
place with the correct a priori probability. Notice that if a target is built up entirely of
default surfaces, the highest confidence we can have in that target is its a priori value,
which is precisely what will be computed. However, if all the constituent surfaces were
found in the image, their respective a priori probabilities would have been multiplied by
the number of classes n to obtain new confidence values (see Phase 2 description), which
would result in a confidence value of 1.0 for the whole target. The more surfaces we find
(or, in the unequally weighted construct case, the higher the number of "important"
surfaces found), the higher our confidence in the composite target.
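The arithmetic can be checked on the simplest target, the M113, whose two surfaces each carry an a priori probability of 0.125 (Figure 3.4.11). The Python sketch below is illustrative only; the function names `leaf_confidence` and `target_confidence` are hypothetical.

```python
N_CLASSES = 4
PRIORS = {"large_side": 0.125, "small_track": 0.125}


def leaf_confidence(name, found):
    """n*c for a surface found in the image, c for the default surface."""
    return N_CLASSES * PRIORS[name] if found else PRIORS[name]


def target_confidence(found_by_name):
    """Confidence of the parent construct: sum over its constituent parts."""
    return sum(leaf_confidence(name, found)
               for name, found in found_by_name.items())


if __name__ == "__main__":
    print(target_confidence({"large_side": False, "small_track": False}))
    print(target_confidence({"large_side": True, "small_track": False}))
    print(target_confidence({"large_side": True, "small_track": True}))
```

With no surfaces found the sum is the a priori value 0.25; with one surface found it is 0.5 + 0.125 = 0.625; with both found it is 1.0, in agreement with the description above.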


;; Phase 2 Production Rules : Hypothesize Surface Identities

(p H-turret_body
    (Phase ^description hypothesize ^status active)
    (World ^classes <number>)
    (Construct ^type turret_body ^confidence <conf> ^surfaces 0)
    (Surface ^id { <s> <> nil <> 0 }
             ^y_loc { > 2.254 < 2.346 }         ; elevation 2.3 +- 2%
             ^height { > 1.372 < 1.428 }        ; height 1.4 +- 2%
             ^width { > 1.96 < 2.04 }           ; width 2.0 +- 2%
             ^area { > 2.744 < 2.856 }          ; area 2.8 +- 2%
             ^fit_error { < 1000 } )            ; planar
    -->
    (make Construct ^type turret_body
                    ^confidence (compute <number> * <conf>)
                    ^surfaces <s> )
)

(p H-large_gunbarrel
    (Phase ^description hypothesize ^status active)
    (World ^classes <number>)
    (Construct ^type large_gunbarrel ^confidence <conf> ^surfaces 0)
    (Surface ^id { <s> <> nil <> 0 }
             ^y_loc { > 2.107 < 2.193 }         ; elevation 2.15 +- 2%
             ^area { > 0.4175 < 0.4345 }        ; area 0.426 +- 2%
             ^h_to_w { > 0.02303 < 0.02397 } )  ; h_to_w 0.0235 +- 2%
    -->
    (make Construct ^type large_gunbarrel
                    ^confidence (compute <number> * <conf>)
                    ^surfaces <s> )
)

(p H-Clean_Up
    (Phase ^description hypothesize ^status finished)
    { <prior> (Construct ^type <t> ^surfaces 0) }
    (Construct ^type <t> ^surfaces { <> 0 } )
    -->
    (remove <prior>)
)


Figure  3.4.12  Example  Phase  2  production  rules. 
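The attribute tests in rule H-turret_body amount to checking that each measured value lies inside a window of plus or minus 2% around a nominal value. A minimal Python sketch of that check follows; the names `tolerance_window`, `matches`, and `TURRET_BODY` and the candidate surface are hypothetical, introduced only for illustration.

```python
def tolerance_window(nominal, tol=0.02):
    """Return the open interval (low, high) around a nominal value."""
    return nominal * (1.0 - tol), nominal * (1.0 + tol)


def matches(surface, template, tol=0.02, max_fit_error=1000):
    """True if every templated attribute lies strictly inside its window."""
    if surface["fit_error"] >= max_fit_error:
        return False  # the fit error is too large to call the surface planar
    for attr, nominal in template.items():
        low, high = tolerance_window(nominal, tol)
        if not (low < surface[attr] < high):
            return False
    return True


# Nominal values taken from the comments of rule H-turret_body.
TURRET_BODY = {"y_loc": 2.3, "height": 1.4, "width": 2.0, "area": 2.8}

# Hypothetical measured surface.
candidate = {"id": 7, "y_loc": 2.31, "height": 1.39, "width": 2.01,
             "area": 2.79, "fit_error": 250}
print(matches(candidate, TURRET_BODY))
```

Writing the template as nominal values plus a tolerance, rather than as hard-coded bounds, is one way such rules could be generated automatically from a model.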
;; Phase 3 Production Rules : Build Higher Level Constructs

(p Build-m60a1 ; 312
    (Phase ^description build ^status active)
    (Construct ^type large_turret ^confidence <c1>
               ^surfaces { <s1> <> nil } <s2> <s3> <s6>)
    (Construct ^type small_deck ^confidence <c2>
               ^surfaces <s4>)
    (Construct ^type large_track ^confidence <c3>
               ^surfaces <s5>)
    (Adjacent ^first <s2> ^second <s4> ^edge_type jmp)
    (Adjacent ^first <s2> ^second <s5> ^edge_type jmp)
    (Adjacent ^first <s3> ^second <s5> ^edge_type jmp)
    (Adjacent ^first <s4> ^second <s5> ^edge_type jmp)
    -->
    (make Construct ^type m60a1
                    ^confidence (compute <c1> + <c2> + <c3>)
                    ^surfaces <s1> <s2> <s3> <s4> <s5> <s6>)
) ; 312

(p B-large_turret ; 313
    (Phase ^description build ^status active)
    (Construct ^type hatch ^confidence <c1>
               ^surfaces <s1>)
    (Construct ^type turret_body ^confidence <c2>
               ^surfaces <s2>)
    (Construct ^type turret_front ^confidence <c3>
               ^surfaces <s3>)
    (Construct ^type large_gunbarrel ^confidence <c4>
               ^surfaces <s4>)
    (Adjacent ^first <s1> ^second <s2> ^edge_type jmp)
    (Adjacent ^first <s2> ^second <s3> ^edge_type snd)
    (Adjacent ^first <s3> ^second <s4> ^edge_type jmp)
    -->
    (make Construct ^type large_turret
                    ^confidence (compute <c1> + <c2> + <c3> + <c4>)
                    ^surfaces <s1> <s2> <s3> <s4>)
) ; 313


Figure  3.4.13  Example  Phase  3  production  rules. 
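The logic of the large turret rule can be paraphrased in Python as follows. This is a sketch, not the system's implementation: the function name, the data layout, and the example confidences and surface ids are assumptions, and the handling of default surfaces is left out.

```python
def build_large_turret(constructs, adjacent):
    """Combine four part hypotheses into a large_turret hypothesis.

    constructs: dict mapping a construct type to (confidence, surface_id).
    adjacent:   set of (first, second, edge_type) tuples.
    Returns (confidence, surface_ids), or None if a condition fails.
    """
    parts = ["hatch", "turret_body", "turret_front", "large_gunbarrel"]
    if any(p not in constructs for p in parts):
        return None
    (c1, s1), (c2, s2), (c3, s3), (c4, s4) = (constructs[p] for p in parts)
    # The same three adjacency conditions as the rule's left-hand side.
    required = {(s1, s2, "jmp"), (s2, s3, "snd"), (s3, s4, "jmp")}
    if not required <= adjacent:
        return None
    # The parent's confidence is the sum of its children's confidences.
    return c1 + c2 + c3 + c4, [s1, s2, s3, s4]


# Hypothetical part hypotheses and adjacency facts.
constructs = {"hatch": (0.25, 12), "turret_body": (0.25, 13),
              "turret_front": (0.25, 14), "large_gunbarrel": (0.125, 15)}
adjacent = {(12, 13, "jmp"), (13, 14, "snd"), (14, 15, "jmp")}
print(build_large_turret(constructs, adjacent))
```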


[Tree diagram not reproduced; surviving node labels: med rear deck : 4/100, bmp : 1/4.]

Figure 3.4.14 A priori probabilities with constructs weighted according to their relative
importance.




kak/yoder 


3.4.1.2.5.  Phases  4  &  5:  Garbage  Collection  and  Output  of  Results 

At  this  point  we  have  done  the  hard  part  of  our  processing.  In  Phase  4  we  throw 
away  those  constructs  whose  confidence  is  less  than  our  likelihood  threshold,  and  then 
clean  up  working  memory.  We  are  done  with  surface  attribute  and  relationship  informa¬ 
tion  at  this  point,  so  all  of  that  excess  baggage  may  be  discarded.  All  that  should  remain  in 
working  memory  are  constructs  we  are  confident  in  and  the  World  element  describing  our 
problem  domain.  Figure  3.4.15  contains  example  Phase  4  production  rules.  Phase  5 
presents  our  results  by  writing  out  each  construct,  our  confidence  in  it,  and  the  list  of  sur¬ 
faces  making  it  up.  The  output  rules  are  listed  in  Figure  3.4.16. 
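The effect of Phase 4 on working memory can be sketched in Python. The dictionary-based working memory below is a hypothetical stand-in for OPS5 working memory elements; following the Clean_Up-Unlikely rule of Figure 3.4.15, a construct is removed when its confidence is less than or equal to the threshold.

```python
def garbage_collect(working_memory, threshold):
    """Keep only the World element and constructs above the threshold."""
    kept = []
    for element in working_memory:
        kind = element["class"]
        if kind == "World":
            kept.append(element)
        elif kind == "Construct" and element["confidence"] > threshold:
            kept.append(element)
        # Surface and Adjacent elements are dropped unconditionally.
    return kept


# Hypothetical working memory contents before Phase 4.
working_memory = [
    {"class": "World", "classes": 4, "threshold": 0.5},
    {"class": "Surface", "id": 1},
    {"class": "Adjacent", "first": 1, "second": 2, "edge_type": "snd"},
    {"class": "Construct", "type": "m60a1", "confidence": 0.99992},
    {"class": "Construct", "type": "bmp", "confidence": 0.25},
]
for element in garbage_collect(working_memory, 0.5):
    print(element)
```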

3.4.1.2.6. An Example

Figure 3.4.17 shows a script session of the program in action. Once inside LISP, the
first step is to load OPS5, our data object declarations, the system rules, and the
initialization rule. We then load in surface data found by our low level processing. The file
used in this example contains simulated data for three targets, and is much like that in
Figure 3.4.18. The first target is a BRDM2 APC with a missing surface, the second is a
complete M60A1 tank, and the third is another M60A1 whose gun barrel has been broken up
into two parallel parts by low level processing. The program successfully finds the APC,
substituting default surface 0 for the missing surface. The complete tank is easily found
with high confidence, and two possibilities are given for the second tank, each using one of
the possible gunbarrels (surfaces 23 & 24). All other objects and constructs had confidences
that were too low to consider. The command (wm) examines the contents of working memory,
and shows that only the element defining the number of object classes and the confidence
threshold remains.

3.4.1.2.7.  Conclusions 

We are very pleased with our results using this approach, and it is the one we will
build on in the future. The data-driven philosophy has proven itself superior to any kind of
top-down approach. Our second expert system is a much cleaner program because the
architecture and techniques used arise naturally when one looks at the problem from the
proper perspective. Aside from a few problems with OPS5, the system performs very
efficiently. It deals with missing surfaces and low level processing errors in a simple,
noncontrived fashion, and it easily finds all targets present in the image. The method of
target decomposition into constructs is a beautifully simple idea, and it reflects how
humans perceive complex objects. The definition of constructs and assignment of weighted
a priori probabilities can be done quickly through a user interface for an object modeller,
and from there rules can be generated automatically.


;; Phase 4 Production Rules : Perform Garbage Collection

(p Clean_Up-Adjacencies
    (Phase ^description clean ^status active)
    { <adj> (Adjacent) }
    -->
    (remove <adj>)
)

(p Clean_Up-Surfaces
    (Phase ^description clean ^status active)
    { <surf> (Surface) }
    -->
    (remove <surf>)
)

(p Clean_Up-Prior
    (Phase ^description clean ^status active)
    { <prior> (Construct ^surfaces nil) }
    -->
    (remove <prior>)
)

(p Clean_Up-Unlikely
    (Phase ^description clean ^status active)
    (World ^threshold <thresh>)
    { <unlikely> (Construct ^confidence { <= <thresh> }) }
    -->
    (remove <unlikely>)
)


Figure  3.4.15  Example  Phase  4  production  rules. 


;; Output Results

(p Output-Type_and_Confidence ; 501
    { <object> (Construct ^type { <t> <> output } ^confidence <c>) }
    - (Phase)
    -->
    (write (crlf) | Object class : | <t> (crlf) )
    (write | Confidence : | <c> (crlf) )
    (write | Surfaces : |)
    (modify <object> ^type output ^confidence nil)
) ; 501

(p Output-Surfaces ; 502
    { <object> (Construct ^type output ^surfaces { <s> <> nil }) }
    - (Phase)
    -->
    (write (rjust 4) <s>)
    (bind <first-surface> (litval surfaces))
    (bind <second-surface> (compute <first-surface> + 1))
    (modify <object> ^surfaces (substr <object> <second-surface> inf) nil)
) ; 502

(p Output-Complete ; 503
    { <object> (Construct ^type output ^surfaces nil) }
    - (Phase)
    -->
    (write (crlf))
    (remove <object>)
) ; 503


Figure  3.4.16  Output  Results  production  rules. 


% lisp
Franz Lisp, Opus 43.1 [vax dec/bsd.l]
(C) Copyright 1985,1986,1987 Franz Inc., Alameda Ca.
=> (load 'startup)
;; Loading file "startup"
;; Fast loading file "/usr/lib/lisp/ops5.o"
;; Loading file "declarations"
;; Loading file "rules"
**************************************************
t
=> (load 'test.all)
;; Loading file "test.all"
t
=> (run)

Loading file "make"

Object class : brdm2
Confidence   : 0.93739
Surfaces     :    3   2   1   4   5   0   7   8   9  10  11

Object class : m60a1
Confidence   : 0.99992
Surfaces     :   12  13  14  15  16  17

Object class : m60a1
Confidence   : 0.99992
Surfaces     :   18  19  20  21  22  23

Object class : m60a1
Confidence   : 0.99992
Surfaces     :   18  19  20  21  22  24

end -- no production true
  52 productions (614 // 1137 nodes)
 520 firings (721 rhs actions)
 156 mean working memory size (284 maximum)
  93 mean conflict set size (292 maximum)
 476 mean token memory size (933 maximum)
nil
=> (wm)
57: (World ^classes 4 ^threshold 0.5)nil
=> (exit)


Figure  3.4.17  Example  session  with  OPS5  expert  system. 


;; Low-level output for range image containing BMP APC.

;; Surface Attributes

(make Surface ^id 1 ^x_loc 2.3 ^y_loc 1.75
              ^height 0.5 ^width 0.4 ^area 0.2
              ^fit_error 250)

(make Surface ^id 2 ^x_loc 3.0 ^y_loc 1.75
              ^height 0.5 ^width 0.4 ^area 0.2
              ^fit_error 342)

(make Surface ^id 3 ^x_loc 3.7 ^y_loc 1.75
              ^height 0.5 ^width 0.4 ^area 0.2
              ^fit_error 501)

(make Surface ^id 4 ^x_loc 4.6 ^y_loc 1.75
              ^height 0.07 ^width 1.224 ^area 0.08568
              ^fit_error 186)

(make Surface ^id 5 ^x_loc 1.35 ^y_loc 1.35
              ^height 0.4 ^width 2.9 ^area 0.58
              ^fit_error 603)

(make Surface ^id 6 ^x_loc 4.0 ^y_loc 1.25
              ^height 0.5 ^width 6.5 ^area 2.67
              ^fit_error 23)

(make Surface ^id 7 ^x_loc 3.0 ^y_loc 0.5
              ^height 1.0 ^width 6.0 ^area 5.5
              ^fit_error 438)

;; Surface Relations

(make Adjacent ^first 1 ^second 2 ^edge_type snd)
(make Adjacent ^first 2 ^second 3 ^edge_type snd)
(make Adjacent ^first 3 ^second 4 ^edge_type jmp)
(make Adjacent ^first 5 ^second 6 ^edge_type snd)
(make Adjacent ^first 1 ^second 5 ^edge_type jmp)
(make Adjacent ^first 3 ^second 6 ^edge_type jmp)
(make Adjacent ^first 5 ^second 7 ^edge_type snd)
(make Adjacent ^first 6 ^second 7 ^edge_type snd)


Figure  3.4.18  Example  low  level  input  to  our  expert  system. 
3.4.1.3. Future Work

Encouraged  by  our  results,  we  have  many  new  ideas  to  investigate.  The  first  path  we 
wish  to  explore  is  the  automatic  generation  of  all  possible  target  aspects  from  model  infor¬ 
mation.  Once  we  can  generate  aspects  automatically,  the  next  task  will  be  generation  of 
rules  to  recognize  each  of  these  aspects.  We  will  also  be  trying  to  determine  which  attri¬ 
butes  are  important  for  the  recognition  of  specific  object  component  surfaces.  The  attri¬ 
butes  and  relations  used  may  depend  on  the  particular  target  aspect  to  be  recognized,  and 
once  they  are  known  we  will  be  able  to  extend  our  low  level  processing  routines  and 
bridge  the  gap  to  our  expert  system.  We  will  also  need  to  examine  the  unique  properties  of 
LADAR  data,  including  a  study  of  its  noise  characteristics,  and  keep  these  in  mind  as  we 
explore  the  above  ideas.  The  problem  of  determining  the  range  at  which  the  geometric 
approach  breaks  down,  forcing  the  use  of  the  2-D  silhouette  approach,  also  needs  to  be 
addressed. 

3.4.2.  A  Multi-Resolution  Data  Structure  for  Model-Based  Geometric  Reasoning 

The availability of LADAR sensor data brings with it the promise of using geometric
techniques to improve target recognition. Much of the work that has been performed in
geometric reasoning has been in the area of robot vision [Besl88]. In many ways robot
vision is an easier domain to work in because the environment is much more controlled
than the domain of LADAR sensing. In a robot cell, the perceived size of an object can be
controlled to be almost constant. However, the perceived size of a target in a LADAR
image can vary by more than a factor of ten (i.e., the target can be closer than 500
meters, or farther than 5000 meters). Such a wide range of possible distances to the target
indicates that there is also a wide range in the on-target resolution that can be used to
classify the target.

This section presents a novel multi-resolution data structure (called a multi-resolution
aspect graph) for model-based geometric classification of targets. Given a 3D solid model
of a target, it can automatically produce a hierarchical representation of the target ranging
from very low resolution to high resolution. The low resolution representation excludes all
structural information about a target except its silhouette. Such information is useful in
classifying targets that are farthest from the sensor. The high resolution information can
give details about the relative location, size, orientation, etc., of all the surfaces and
edges on the target. Such information is useful in generating hypotheses to verify the
identity of the target.

Section 3.4.2.1 surveys a few geometric methods which have been used on LADAR
data and shows how the multi-resolution aspect graph can enhance their performance.
Section 3.4.2.2 describes aspect graphs, a data representation that provides a method of
selecting viewpoints which see the same features of a given target. Section 3.4.2.3
presents the multi-resolution aspect graph (MRAG), a new data structure that considers the
resolution of the LADAR sensor when selecting which views see the same features.

3.4.2.1. Geometric Based Target Recognition

There are numerous methods for using geometric information for object recognition
[Besl88]. Some approaches were presented earlier in this section, and [VeWi87] presents
another; all are designed specifically for LADAR data. The following paragraphs give a
brief summary of each method.

3.4.2.1.1.  Surface  Segmentation  Approach 

The Surface Segmentation method presented in Section 6.1.1 of [KaYo88] classifies a
target based on the type, shape, relative location, orientation, area, etc., of the surfaces
which comprise the target. Surface types include planar, cylindrical, spherical, and
unknown. The surface shapes include irregular, trapezoid, rectangle, and square. A
bottom-up rule based system groups surfaces which are nonconcavely adjacent and nearly
coplanar into objects, then higher level rules attempt to match these objects to known
objects.

The main disadvantage of this system is that very detailed surface information is
needed. It may not be possible to measure the curvature of a surface that is more than one
kilometer away. Many surfaces must be found on a given target before it can be accurately
classified. The current LADAR sensors may be able to deliver such information for
targets less than a kilometer away, but for targets at four to five kilometers, two or three
planar surfaces, at best, can be identified.

3.4.2.1.2.  Goal-Driven  Approach 

The Goal-Driven Top-Down approach presented in Section 3.4.1.1 assumes the area
and location of all the surfaces of a target can be located. With this information it uses a
goal driven approach to match the surfaces to a given target using the Prospector Model of
Uncertainty Reasoning [Bratko, Duda]. The result of the match is a confidence value
showing how well the surfaces matched the model. This process is repeated for each
possible target in the image and the target is classified as the model with the highest
confidence.

Although  this  system  can  handle  missing  surfaces,  it  still  depends  on  the  bulk  of  the 
surfaces  being  found  before  it  has  significant  confidence  in  its  match.  In  addition,  there 
currently  is  no  automatic  method  for  building  the  probability  tree  (See  Figure  3.4.5).  This 
tree  must  be  rebuilt  if  the  list  of  candidate  targets  changes.  It  also  needs  a  hypothesis  gen¬ 
erating  system  to  select  which  target  models  to  attempt  to  match  to  the  unknown  surfaces, 
otherwise  it  must  compare  the  unknown  surfaces  to  every  known  model. 
3.4.2.1.3. Data-Driven Approach

The Data-Driven Bottom-Up approach presented in Section 3.4.1.2 also assumes that
most of the surfaces of an unknown target can be located and certain features of these
targets can be measured. Instead of trying to prove the surfaces are from a certain target as
the Goal-Driven method did, the data-driven method has a library of all possible surfaces
which can be seen, based on the knowledge of all the possible targets which can be
seen. It then attempts to match each unknown surface to the surfaces in the library and
build up to the target.

This system, like the others, relies on being able to identify most of the visible surfaces
on the target. Although it can use default reasoning if a couple of the surfaces are missing,
if the bulk of the surfaces cannot be found, this method may fail.

3.4.2.1.4. Appearance Models

The Appearance Model approach [VeWi87] is like the Surface Segmentation method
from Section 6.1.1 of [KaYo87] in that it must identify various parts (and their relative
locations) of an unknown target in order to identify the entire target. Figure 3.4.19 shows
the appearance model of a vehicle which can be a tank, a Howitzer, or an APC. Each of
the possible targets is then broken down into smaller objects. The tank, for example, is
classified by finding the tank gun beside the tank turret, which has the tank body below it
and the tank antenna above it.

The major drawback of this system is that it must be able to identify fine features such
as the tank gun and antenna in order to identify the tank. In much of the current LADAR
data only the tank body (and maybe the turret) can be found. The inability to find the
other features would hinder the Appearance Model approach. The other drawback is that
there is no automatic method to create the appearance model.

3.4.2.1.5.  Summary  of  Geometric  Approaches 

Each of these systems relies heavily on the ability to accurately identify several
features (edges, surfaces, or entire parts of the target such as the turret or main gun) of the
target. The advantage of such an approach is that if the desired features can be measured,
the system can make good use of the information to classify the target. However, current
technology sensors can report such information only about targets at ranges of a kilometer
or less, while LADAR systems need to be able to classify targets that are at least four to
five kilometers away. Although future sensors will most likely be able to deliver detailed
data at such distances, such data is not available today. The systems discussed above
could fail to classify the target if the bulk of the features cannot be identified. The
multi-resolution aspect graph could be used to enhance the above systems by: 1) providing an
automatic means of generating the probability trees needed by both the Top-Down and
Bottom-Up approaches, and 2) automatically generating hypotheses (goals) for the
Top-Down system. Section 3.4.2.2 gives some background information and Section 3.4.2.3
presents the actual structure.

[Diagram not reproduced. Legend: NODES: V = VEHICLE, T = TANK, H = HOWITZER.
LINKS: S = SPECIALIZATION, P = PART.]

Figure 3.4.19 Region-Based appearance model of a vehicle which can be a Tank, Howitzer, or
an APC. (From [VeWi87]).

3.4.2.2.  Aspect  Graphs 

The appearance of a target varies greatly with the point from which it is viewed.
Although the overall appearance of a target will change as the target is simply rotated
about its axis through 90° of rotation, the visibility of geometric features (such as surfaces
and edges) does not change greatly from one view to another. (The characteristics of
features may change as the target turns, but the features usually will not completely appear
or disappear.) Such an observation has led to the creation of aspect graphs.

An aspect graph [KoDo76] characterizes the possible viewpoints from which a target
can be viewed by grouping viewpoints that see the same features into equivalence classes.
A node in the aspect graph corresponds to all the viewpoints that can observe the same
features. Aspect graphs can be generated analytically or by exhaustively examining the
object. When generating aspect graphs exhaustively, the object is centered within a
tessellated viewing sphere (with between 60 and 80 tessels [HuKa88, HaHe87]) and the
geometric model is viewed from each of the tessels. The visible features from each tessel are
recorded and the tessels which view the same features are grouped together. Since the
LADAR sensor is currently ground based, our aspect graphs are generated from TWIN
models by using a viewing cylinder (i.e., only the tessels corresponding to ground level
views are used). Figure 3.4.20 shows an M113 as viewed from 32 tessels one meter above
ground level at a range of one kilometer and a resolution of 0.05 mrads between pixels.
Since surfaces are the most visible feature in the LADAR data, they are used as the
geometric feature for generating the aspect graph (other features can be used). Figure
3.4.21 is a list of all the surfaces that are visible from each viewpoint. The tessels
which see the same surfaces are grouped together in Figure 3.4.22. Each group represents
a node in the aspect graph of the M113. Figure 3.4.23 shows the views corresponding to
each node in the aspect graph. The M113 is a "boxy" object and the model is not
detailed, so the nodes on the aspect graph happen to represent eight equally spaced views.
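The exhaustive grouping step can be sketched in Python: viewpoints whose sets of visible surfaces are identical fall into the same node. The function name `aspect_graph_nodes` and the data layout are assumptions made for this illustration; the sample rows are four of the M113 tessels listed in Figure 3.4.21.

```python
from collections import defaultdict


def aspect_graph_nodes(visible):
    """Group viewpoints (tessels) that see exactly the same surfaces.

    visible: dict mapping tessel number -> iterable of visible surface ids.
    Returns a sorted list of (tessel list, surface list), one entry per node.
    """
    groups = defaultdict(list)
    for tessel, surfaces in visible.items():
        # A frozenset makes the surface set usable as a dictionary key.
        groups[frozenset(surfaces)].append(tessel)
    return sorted((sorted(tessels), sorted(key))
                  for key, tessels in groups.items())


# M113 tessels 24 to 27 from Figure 3.4.21.
visible = {
    24: [6, 7, 8, 9, 10, 11, 12, 13, 17, 18, 20, 22],
    25: [6, 13, 17, 18, 20, 22],
    26: [2, 3, 4, 5, 13, 20, 22],
    27: [2, 3, 4, 5, 13, 20, 22],
}
for tessels, surfaces in aspect_graph_nodes(visible):
    print(tessels, surfaces)
```

Tessels 26 and 27 see the same surfaces and so share a node, while tessels 24 and 25 each fall into a different node.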

As the number of visible surfaces in the model increases, the number of nodes in the
aspect graph increases. Figures 3.4.24 - 3.4.28 illustrate this for the M60A1 model.
Figure 3.4.24 shows the 32 views of an M60A1; Figure 3.4.25 shows each of the surface
numbers on the M60A1 model; Figure 3.4.26 lists the surfaces visible from each tessel;
Figure 3.4.27 groups the tessels which view the same surfaces; and finally, Figure 3.4.28
shows the corresponding views of the M60A1 target. Note that the M60A1 is a less boxy
target and therefore has more surfaces, which results in it having twice as many nodes in
its aspect graph as the M113 does. The aspect graphs of both targets will have even more
nodes once the BRL-CAD models are converted to TWIN models.


Tessel Number    Visible Surfaces

 1               [illegible]
 2               2 3 4 5 14 21 24
 3               2 3 4 5 14 21 24
 4               2 3 4 5 14 21 24
 5               2 3 4 5 14 21 24
 6               2 3 4 5 14 21 24
 7               2 3 4 5 14 21 24
 8               2 3 4 5 14 21 24
 9               2 3 14 21 24
10               6 7 8 9 10 11 12 14 17 19 21 24
11               6 7 8 9 10 11 12 14 17 19 21 24
12               6 7 8 9 10 11 12 14 17 19 21 24
13               6 7 8 9 10 11 12 14 17 19 21 24
14               6 7 8 9 10 11 12 14 17 19 21 24
15               6 7 8 9 10 11 12 14 17 19 21 24
16               6 7 8 9 10 11 12 14 17 19 21 24
17               6 7 8 9 10 11 12 14 17 21 24
18               6 7 8 9 10 11 12 13 17 18 20 22
19               6 7 8 9 10 11 12 13 17 18 20 22
20               6 7 8 9 10 11 12 13 17 18 20 22
21               6 7 8 9 10 11 12 13 17 18 20 22
22               6 7 8 9 10 11 12 13 17 18 20 22
23               6 7 8 9 10 11 12 13 17 18 20 22
24               6 7 8 9 10 11 12 13 17 18 20 22
25               6 13 17 18 20 22
26               2 3 4 5 13 20 22
27               2 3 4 5 13 20 22
28               2 3 4 5 13 20 22
29               2 3 4 5 13 20 22
30               2 3 4 5 13 20 22
31               2 3 4 5 13 20 22
32               2 3 4 5 13 20 22

Figure 3.4.21 Surfaces of M113 which are visible from each viewpoint.
Tessel Numbers          Visible Surfaces

17                      6 7 8 9 10 11 12 14 17 21 24
18 19 20 21 22 23 24    6 7 8 9 10 11 12 13 17 18 20 22
25                      6 13 17 18 20 22
26 27 28 29 30 31 32    2 3 4 5 13 20 22

Figure 3.4.22 Figure 3.4.21 with viewpoints grouped together which view the same surfaces.
Figure 3.4.23 Views of M113 that correspond to each node in the aspect graph.
Tessel Number    Visible Surfaces

 1               2 3 6 7 8 9 10 11 33 34 35 38 42 45
 2               2 3 5 6 7 8 9 10 11 21 25 28 33 34 35 36 38 39 42 44 45
 3               2 3 5 6 7 8 9 10 11 21 25 28 33 34 35 36 38 39 42 44 45
 4               2 3 5 6 7 8 9 10 11 21 25 28 33 34 35 36 38 39 42 44 45
 5               2 3 5 6 7 8 9 10 11 21 25 28 34 35 36 38 39 42 44 45
 6               2 3 5 6 7 8 9 10 11 21 25 28 34 35 36 38 39 42 44 45
 7               2 3 5 6 7 8 9 10 11 21 25 28 34 35 36 38 39 42 44
 8               2 3 5 6 7 8 9 10 11 21 25 28 34 35 36 38 39 42 44
 9               7 21 25 35 36 39 44
10               12 13 14 15 16 17 18 20 21 25 28 29 31 35 36 39 40 44
11               12 13 14 15 16 17 18 20 21 25 28 29 31 35 36 39 40 44
12               12 13 14 15 16 17 18 20 21 25 28 29 31 35 36 39 40 44
13               12 13 14 15 16 17 18 20 21 25 28 29 31 35 36 39 40 44
14               12 13 14 15 16 17 18 20 21 25 28 29 31 36 39 40 44
15               12 13 14 15 16 17 18 20 21 25 28 29 31 36 39 40 44
16               12 13 14 15 16 17 18 20 21 25 28 29 31 36 39 40
17               12 13 14 15 16 17 18 29 31 32 36 39 40 41
18               12 13 14 15 16 17 18 19 23 24 29 30 31 32 40 41
19               12 13 14 15 16 17 18 19 23 24 29 30 31 32 40 41 44
20               12 13 14 15 16 17 18 19 23 24 29 30 31 32 40 41 44
21               12 13 14 15 16 17 18 19 23 24 29 30 31 32 33 40 41 44
22               12 13 14 15 16 17 18 19 23 24 29 30 31 32 33 40 41 44
23               12 13 14 15 16 17 18 19 23 24 29 30 31 32 33 40 41 44
24               12 13 14 15 16 17 18 19 23 24 29 30 31 32 33 40 41 44
25               9 23 24 32 33 41 44
26               2 3 4 6 7 8 9 10 11 23 24 30 32 33 34 38 41 42 44
27               2 3 4 6 7 8 9 10 11 23 24 30 32 33 34 38 41 42 44
28               2 3 4 6 7 8 9 10 11 23 24 30 32 33 34 38 41 42 44 45
29               2 3 4 6 7 8 9 10 11 23 24 30 32 33 34 38 41 42 44 45
30               2 3 4 6 7 8 9 10 11 23 24 30 32 33 34 35 38 41 42 44 45
31               2 3 4 6 7 8 9 10 11 23 24 30 32 33 34 35 38 41 42 44 45
32               2 3 4 6 7 8 9 10 11 23 24 30 32 33 34 35 38 41 42 44 45

Figure 3.4.26 Surfaces of M60A1 which are visible from each viewpoint.
Tessel Numbers    Visible Surfaces

14 15             12 13 14 15 16 17 18 20 21 25 28 29 31 36 39 40 44
28 29             2 3 4 6 7 8 9 10 11 23 24 30 32 33 34 38 41 42 44 45
30 31 32          2 3 4 6 7 8 9 10 11 23 24 30 32 33 34 35 38 41 42 44 45

Figure 3.4.27 Figure 3.4.26 with viewpoints grouped together which view the same surfaces.


Figure 3.4.28 Views of M60A1 that correspond to each node in the aspect graph.

Aspect graphs themselves can provide a suitable data structure for a model-based
geometric reasoning system. For example, a classifier could be trained on the silhouettes
of the images corresponding to the nodes in the aspect graph. (See Figures 3.4.23 and
3.4.28.) The classifier would match an unknown target to one or more of the silhouettes.
Each silhouette corresponds to a given viewpoint of the 3D model and therefore has
geometric information about that view of the target. This information can be used to
generate hypotheses about the geometric features present in the target. For example, if the
unknown target matched Figure 3.4.28 (c), the modeler could hypothesize that if the target
is an M60A1, then there must be a 90° edge in the lower half of the target and a jump edge
between the bottom half of the target and the top half.

Such  an  approach  is  a  workable  solution  if  the  target  models  are  very  simple  like  the 
ones  from  ERIM.  If  more  detailed  targets  are  used,  like  those  from  BRL,  the  number  of 
nodes  in  the  aspect  graph  would  increase  greatly.  Most  of  the  added  details  would  not  be 
visible  in  distant  LADAR  images.  The  following  section  presents  a  method  which  will 
systematically  reduce  the  number  of  nodes  in  the  aspect  graph  (if  needed)  to  match  the 
detectable  features  of  the  target. 

3.4.2.3.  Multi-Resolution  Aspect  Graphs 

The previous section has shown how aspect graphs can take a 3D model of a target
and break it down into a number of candidate views, each seeing a different set of features.
If a detailed model is used, the aspect graph will have many nodes (viewpoints). This section
presents a new method for reducing the number of nodes in an aspect graph so that
detailed models can still be used to classify distant targets that have few pixels on target.

It is possible for a small feature in an object (a feature so small that the LADAR sensor
cannot see it) to cause additional (possibly unneeded) nodes to appear in the aspect graph.
The multi-resolution aspect graph is an extension of the aspect graph which systematically
removes features from consideration when building the aspect graph so that the graph
is not influenced by features the sensor will never be able to see. The following two sections
discuss how to build a multi-resolution aspect graph based on visible surface area and
the angles between two surfaces.
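The grouping idea can be illustrated with a minimal Python sketch. It assumes that each viewpoint (tessel) has already been reduced to the set of surface numbers visible from it; the function name `build_aspect_nodes`, the `detectable` argument, and the visibility sets in the example are all hypothetical and are not taken from the report's software.

```python
from collections import defaultdict

def build_aspect_nodes(visible, detectable=None):
    """Group tessels that see the same set of surfaces into one node.

    visible    -- dict: tessel id -> set of visible surface numbers
    detectable -- optional set of surfaces the sensor can resolve;
                  all other surfaces are ignored when comparing views
    """
    nodes = defaultdict(list)
    for tessel, surfaces in sorted(visible.items()):
        seen = frozenset(surfaces)
        if detectable is not None:
            seen &= frozenset(detectable)  # drop features the sensor cannot see
        nodes[seen].append(tessel)
    return dict(nodes)

# Hypothetical visibility sets: tessels 5 and 6 see one extra tiny surface.
visible = {5: {18, 24, 45}, 6: {18, 24, 45}, 7: {18, 24}, 8: {18, 24}}
full = build_aspect_nodes(visible)                             # two nodes
reduced = build_aspect_nodes(visible, detectable={18, 24})     # one node
```

With the tiny surface ignored, the four tessels collapse into a single node, which is the effect the multi-resolution aspect graph is meant to achieve.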

3.4.2.3.1. Surface Area Based Multi-Resolution Aspect Graph

Figure 3.4.29 is a list of all the surfaces in the M60A1 model. The numbers in the
Real Area column represent the actual area of the surface as measured on the 3D model.
The numbers in the Viewed Area column are the number of pixels covered by the surface
when it is projected onto a 2D plane assuming a 0.05 mrad resolution and a distance of one
kilometer. Since this area changes with viewpoint, the maximum area seen from all of the
tessels is the one recorded. Notice that surface 29 has the largest real area, but it has one
of the smaller viewed areas. This is because it is the surface on the bottom of the tank.


Surface    Real Area      Viewed Area
Number     (meters²)      (pixels)

   2        0.26793          60.00
   3        0.26793          56.00
   4        0.34750         129.00
   5        0.34750         129.00
   6        3.16716         440.00
   7        0.88122         315.00
   8        0.62581         105.00
   9        0.88122         295.00
  10        0.62581          98.00
  11        2.03758         660.00
  12        0.21110          56.00
  13        0.21110          60.00
  14        0.69235         266.00
  15        0.61492         126.00
  16        0.69235         285.00
  17        0.61492         135.00
  18        3.08000        1232.00
  19        0.12650          46.00
  20        0.12650          46.00
  21        0.70000         280.00
  23        0.70000         280.00
  24        9.92135        3992.00
  25        9.92135        3991.00
  28        2.61005         373.00
  29       13.27565         220.00
  30        2.61005         373.00
  31        2.34000         935.00
  32        2.83907        1120.00
  33        1.48808         576.00
  34        0.61393         244.00
  35        1.48808         576.00
  36        2.8396[illegible]  1120.00
  38        2.4[illegible]      564.00
  39        0.37112         135.00
  40        0.63643         110.00
  41        0.37112         132.00
  42        0.36450         135.00
  44        0.16302         142.00
  45        0.00707           2.00

Figure  3.4.29  Actual  and  maximum  viewed  sizes  of  the  surfaces  in  the  M60A1  model. 
Although it is the largest surface, only a small part of it is ever visible at a given time.
Surface number 45 is the end of the main gun of the M60A1. Although its viewed area is
only two pixels, its presence in tessels 5 and 6 (Figure 3.4.27) and absence in tessels 7 and
8 is enough to cause those tessels to be in different nodes of the aspect graph. This surface
would most likely never be seen by a LADAR sensor, and should therefore not cause the
aspect graph to have two nodes that differ only by it.
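As a rough check on the Viewed Area column, the pixel footprint at the stated geometry can be worked out directly. The sketch below uses the small-angle approximation and assumes a surface viewed head-on with nothing in front of it; the function names are illustrative only. At 1 km and 0.05 mrad a pixel is about 0.05 m on a side, so one square meter covers about 400 pixels, and the 3.08 m² of surface 18 comes to about 1232 pixels, which agrees with Figure 3.4.29.

```python
def pixel_footprint_m(range_m, ifov_mrad):
    """Side length (meters) of one pixel at the target, small-angle approximation."""
    return range_m * ifov_mrad * 1e-3

def area_to_pixels(area_m2, range_m, ifov_mrad):
    """Pixels covered by a surface seen head-on and unoccluded (an idealization)."""
    side = pixel_footprint_m(range_m, ifov_mrad)
    return area_m2 / (side * side)

pixels_18 = area_to_pixels(3.08, range_m=1000.0, ifov_mrad=0.05)
pixels_21 = area_to_pixels(0.70, range_m=1000.0, ifov_mrad=0.05)
```

Surfaces that are never seen head-on, or that are partly hidden (such as surface 29 on the bottom of the tank), fall well below this upper bound.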

In the example above, the features are the surfaces of the target. The surfaces can be
ordered by their largest viewed area, and those surfaces whose area is too small to be
detected at the given resolution are not considered when building the aspect graph.
Suppose that, at the given resolution, the smallest reliably detectable surface covers 575
pixels of viewed area. From Figure 3.4.29 it is seen that only surfaces
11, 18, 24, 25, 31, 32, 33, 35, and 36 would be considered in building the aspect graph.
Figure 3.4.30 is a list of the nodes in the reduced resolution aspect graph, and Figure
3.4.31 shows the corresponding views. The aspect graph has been simplified to match the
resolution of the sensor.
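The surface selection described above amounts to a simple threshold on the Viewed Area column. The following Python sketch reproduces it with the viewed areas transcribed from Figure 3.4.29; the dictionary and function names are illustrative, not part of the original software.

```python
# Maximum viewed area (pixels) per surface, transcribed from Figure 3.4.29.
VIEWED_AREA = {
    2: 60, 3: 56, 4: 129, 5: 129, 6: 440, 7: 315, 8: 105, 9: 295,
    10: 98, 11: 660, 12: 56, 13: 60, 14: 266, 15: 126, 16: 285, 17: 135,
    18: 1232, 19: 46, 20: 46, 21: 280, 23: 280, 24: 3992, 25: 3991,
    28: 373, 29: 220, 30: 373, 31: 935, 32: 1120, 33: 576, 34: 244,
    35: 576, 36: 1120, 38: 564, 39: 135, 40: 110, 41: 132, 42: 135,
    44: 142, 45: 2,
}

def detectable_surfaces(viewed_area, min_pixels):
    """Return the surfaces whose maximum viewed area reaches the threshold."""
    return sorted(s for s, px in viewed_area.items() if px >= min_pixels)

kept = detectable_surfaces(VIEWED_AREA, min_pixels=575)
```

With a threshold of 575 pixels this keeps surfaces 11, 18, 24, 25, 31, 32, 33, 35, and 36, the same list given in the text. Surface 38, at 564 pixels, falls just below the cut.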

3.4.2.3.2.  Edge  Angle  Based  Multi-Resolution  Aspect  Graph 

Edges are another distinctive feature of LADAR imagery that can be used for target
classification. There are two types of edges which can appear in a range image: viewpoint
dependent edges and viewpoint independent edges. Viewpoint dependent edges are generally
those edges which appear between the target and the background, for example the
edges between the turret and the terrain in the background. The model itself gives all the
viewpoint independent edges, and these are the edges which will be used here. Figure
3.4.32 lists all the viewpoint independent edges along with their lengths and the angle
between the normals of the two surfaces which meet at the edges in the M60A1 model.
Figure 3.4.33 (a) plots the count of the number of edges with the same angle and Figure
3.4.33 (b) plots the count of the number of edges with the same length. It is easy to see
that most of the edges are 90°. This is expected to change when more detailed models are
available since most targets are not so “boxy”. Figure 3.4.34 shows a plot of length vs
angle for each of the models used. These again show that in these models, most of the
edges are 90°.

The edge information can be used to adjust the model to the sensor resolution by
observing that edges with small angles between surfaces are more difficult to detect than
edges with large angles. Also, short edges are more difficult to detect than long edges. Our
approach is to remove all edges whose length is less than a given threshold and whose
angle is less than a given threshold. The thresholds used must be determined by the resolution
of the sensor. Since such information is not currently available, Figures 3.4.35 through
3.4.38 were generated to show how the M60A1 model changes as edges are removed.
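A minimal sketch of the edge selection rule, read literally from the paragraph above (an edge is dropped only when it is below both thresholds), might look as follows. The `Edge` record and the sample edges are hypothetical; the real lengths and angles are those listed in Figure 3.4.32.

```python
from typing import NamedTuple

class Edge(NamedTuple):
    surface_a: int      # surface number on one side of the edge
    surface_b: int      # surface number on the other side
    length_m: float     # edge length in meters
    angle_deg: float    # angle between the two surface normals

def undetectable_edges(edges, min_length_m, min_angle_deg):
    """Edges that are both shorter and shallower than the sensor thresholds."""
    return [e for e in edges
            if e.length_m < min_length_m and e.angle_deg < min_angle_deg]

# Hypothetical edges, for illustration only.
edges = [
    Edge(2, 3, 0.05, 45.0),   # short and shallow: removed
    Edge(3, 4, 0.05, 90.0),   # short but sharp: kept
    Edge(4, 7, 1.20, 30.0),   # shallow but long: kept
    Edge(7, 9, 0.08, 60.0),   # short and shallow: removed
]
removed = undetectable_edges(edges, min_length_m=0.1, min_angle_deg=70.0)
```

Other combinations of the two thresholds (for example, removing an edge when either test fails) are equally easy to express; which rule is appropriate depends on the sensor.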
Figure 3.4.30 Nodes of the aspect graph for the M60A1 using only those surfaces larger than
575 pixels.


Figure 3.4.31 Views of M60A1 corresponding to the tessels in Figure 4.12.


Figure  3.4.31  (continued) 
Figure 3.4.32 Angles and lengths of all the edges in the M60A1 model. Ordered by length.

Figure 3.4.33 Count of edges with given angle and given length in M60A1 model.
Figure 3.4.35 M60A1 with edges less than a given length removed. a) original M60A1 b) less
than 0.0 meters c) less than 0.1 meters d) less than 0.2 meters e) less than 0.3
meters f) less than 0.4 meters g) less than 0.5 meters h) less than 0.6 meters i)
less than 0.7 meters j) less than 0.8 meters k) less than 1.0 meters l) less than 1.1
meters m) less than 1.2 meters n) less than 1.4 meters o) less than 1.5 meters p)
less than 1.8 meters q) less than 2.2 meters.


M60A1 with edges less than 70° and less than a given length removed. a) original
M60A1 b) less than 0.0 meters c) less than 0.1 meters d) less than 0.2 meters e)
less than 0.3 meters f) less than 1.1 meters g) less than 2.2 meters.
Edges are removed from the model by assigning the two surfaces which meet at the
edge the same surface number. This is done by assigning both surfaces the minimum surface
number of the two. Figure 3.4.32 is a list of all the edges in the M60A1 model,
ordered from shortest edge to longest. In Figures 3.4.35 and 3.4.36, the edges were
removed in the order shown in Figure 3.4.32 (i.e. based on length only), starting with the
shortest and proceeding to the longest. Remember that in an actual system, the sensor would
determine the shortest detectable edge; therefore the system would not have to compute all the
views shown here. These views are presented to show the appearance of the target at varying
resolutions.
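The relabeling step can be sketched in a few lines of Python. This is only an illustration under one assumption not stated in the report: when two surfaces are merged, every surface that already carries the larger label is relabeled as well, so that chains of merges stay consistent. The function name and the sample surface numbers are hypothetical.

```python
def merge_across_edges(surface_ids, removed_edges):
    """Remove edges by giving the surfaces on both sides the same label.

    surface_ids   -- iterable of original surface numbers
    removed_edges -- iterable of (surface_a, surface_b) pairs, in removal order
    Returns a dict: original surface number -> label after merging.
    """
    label = {s: s for s in surface_ids}
    for a, b in removed_edges:
        keep, drop = sorted((label[a], label[b]))  # keep the minimum number
        if keep == drop:
            continue  # already merged by an earlier edge
        for s in label:
            if label[s] == drop:
                label[s] = keep
    return label

# Hypothetical example: removing edges (3,4) and then (4,7) fuses 3, 4 and 7.
labels = merge_across_edges([2, 3, 4, 7], [(3, 4), (4, 7)])
```

After merging, the aspect graph is rebuilt using the new labels, so viewpoints that differed only in which of the fused surfaces they saw fall into the same node.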

Figure 3.4.39 is a list of all edges in the M60A1 model ordered from smallest to largest
angle up to 70°. After 70° they are ordered from shortest to longest. Such an ordering
might be used if it was known that a sensor could not reliably detect angles less than 70°.
In Figures 3.4.37 and 3.4.38, the edges are removed in the order shown in Figure 3.4.39.
The model simplifies differently than when only length is used (as in Figures 3.4.35 and
3.4.36).

Adjusting both the edge angle threshold and the edge length threshold gives great
flexibility in reducing the complexity of a target. Since no information is available concerning
the angles and edge lengths that can be detected, an angle of 70° and a length of 0.1 meters
were chosen (Figure 3.4.37c). The aspect graph is shown in Figure 3.4.40 and the
corresponding views are in Figure 3.4.41.

3.4.2.4.  Conclusions 

Targets  appearing  in  real  LADAR  data  can  vary  in  size  from  a  few  pixels  to  several 
thousand  pixels.  Such  a  wide  dynamic  range  in  perceived  target  size  introduces  some 
additional  complications  in  modeling  the  targets  for  recognition.  The  multi-resolution 
aspect  graph  is  a  data  structure  which  can  be  used  to  model  such  targets.  It  can  be  used  to 
retrieve  data  about  a  given  target  based  on  the  range  to  the  target. 

Adjusting  thresholds  based  on  the  surface  area,  edge  length,  and  edge  angle  gives 
great  control  over  how  a  model  is  simplified.  Future  work  must  determine  how  to  select 
these  thresholds  based  on  the  sensor  being  used. 
Figure 3.4.40 Figure 3.4.26 with viewpoints grouped together which view the same surfaces of
an M60A1 with edges longer than 0.1 meters and angles greater than 70°.


4.  ELECTRONIC  TERRAIN  BOARD  MODELING 


4.1.  ETBM  via  PADL 

When  developing  ATR  algorithms,  the  actual  data  from  the  sensor  (or  sensors)  should  be 
used  to  test  the  algorithm.  Unfortunately  this  data  is  not  always  readily  available;  field  tests  are 
expensive  and  often  postponed,  and  after  a  successful  field  test  new  scenarios  are  thought  of 
with  an  arrangement  of  targets,  background  and  clutter  not  collected  in  the  field  test.  To  relieve 
this problem, the construction of an Electronic Terrain Board Model is being pursued. Anyone
with such a system will be able to run “electronic” field tests and generate realistic data
without  the  time  and  cost  of  a  conventional  field  test.  To  be  successful,  an  electronic  terrain 
board  must  contain  the  following  components: 

1. geometric models of the targets of interest,

2. models of background and foreground clutter,

3. a model of environmental conditions, and

4. validation routines for testing the integrity of the simulated data.

The  following  sections  address  each  of  the  above  components. 

4.1.1.  Geometric  Models  of  Targets 

A  first-order  approximation  to  modeling  targets  can  be  as  simple  as  collecting  the  wire 
frame  information  about  each  target  to  be  modeled.  The  paragraphs  which  follow  describe  the 
conversion  of  the  wire  frame  data  from  ERIM  to  geometric  models. 

4.1.1.1. Converting Wire Frame Data to Geometric Models

ERIM  has  collected  wire  frame  data  which  describes  seventeen  different  vehicles.  Each 
description  contains  a  list  of  points  in  three  space,  a  list  of  edges  which  are  ordered  pairs  of 
points,  and  a  list  of  surfaces;  each  surface  is  a  list  of  edges  ordered  so  that  using  the  right  hand 
rule,  if  the  fingers  follow  the  edges,  the  thumb  is  the  outward  normal  of  the  surface.  Table  4.1.1 
shows  a  portion  of  each  list  for  the  M60A1  (All  units  are  in  meters).  Some  of  the  surface 
descriptions  list  edges  with  negative  values;  these  negative  values  indicate  that  the  order  of  the 
points  in  the  edge  is  reversed. 
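The signed-edge convention can be made concrete with a short Python sketch. It walks the edges of one surface, reversing any edge listed with a negative number, and then computes the outward normal implied by the right hand rule using Newell's method. The unit-square data and the function names below are hypothetical and stand in for the ERIM lists, whose point and edge entries are not reproduced here.

```python
def surface_loop(surface, edges):
    """Ordered point ids around a surface given as a list of signed edge numbers."""
    loop = []
    for e in surface:
        start, end = edges[abs(e)]
        if e < 0:                      # negative edge: traverse it backwards
            start, end = end, start
        loop.append(start)
    return loop

def outward_normal(loop, points):
    """Newell's method: unnormalized normal of a planar polygon (right hand rule)."""
    nx = ny = nz = 0.0
    for i, p in enumerate(loop):
        x0, y0, z0 = points[p]
        x1, y1, z1 = points[loop[(i + 1) % len(loop)]]
        nx += (y0 - y1) * (z0 + z1)
        ny += (z0 - z1) * (x0 + x1)
        nz += (x0 - x1) * (y0 + y1)
    return (nx, ny, nz)

# Hypothetical unit square in the z=0 plane (meters).
points = {1: (0.0, 0.0, 0.0), 2: (1.0, 0.0, 0.0), 3: (1.0, 1.0, 0.0), 4: (0.0, 1.0, 0.0)}
edges = {1: (1, 2), 2: (2, 3), 3: (4, 3), 4: (4, 1)}
surface = [1, 2, -3, 4]                # edge 3 is stored reversed, so it is negated

loop = surface_loop(surface, edges)
normal = outward_normal(loop, points)
```

Here the loop runs counterclockwise when seen from +Z, so the normal points along +Z; its length is twice the polygon area.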

PADL [Padl], a solid modeling system originally designed for describing industrial parts,
is the geometric modeler being used. Although PADL was not designed for target simulation, it
does have facilities for computing range images of objects, which is what is needed. PADL does
not accept points, edges, or bounded surfaces as input; instead it uses solid objects that are
bounded by planes. Therefore a program was written to convert points, edges, and bounded
surfaces to solid objects. This program required human input since the wire frame data did not
Surface   Edges

 1.   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
 2.   19 -18 20 21
 3.   22 -14 23 24
 4.   25 26 -20 -17
 5.   27 -15 -22 28
 6.   29 -25 -16 -27
 7.   30 -21 -26 31 32
 8.   33 -32 34 35
 9.   -24 36 37 38 -28
10.   -37 39 40 41
11.   42 -31 -29 -38
12.   -2 43 44 45
13.   -12 46 47 48
14.   -44 49 50 51
15.   -50 52 53 54
16.   -47 55 56 57
17.   -56 58 59 6
18.   161 62 65 63 66 64
19.   -45 -51 -230 -65
20.   -55 -46 -66 -231
21.   -62 67 68 -3
22.   -67 -61 69 78
23.   -69 -64 -11 71
24.   72 73 -4 -68 -70 -71 -10 74
25.   -74 -9 75 76 77
26.   -75 -8 78 79
27.   -78 -7 80 81
28.   -80 -6 82 83
29.   -82 -5 -73 84
30.   -72 -77 -76 85 86 -84


Table 4.1.1 Partial listing of points, edges, and surfaces for the wireframe model of an
M60A1. (All dimensions are in meters.)


group the surfaces into objects bounded by planes as was needed for PADL. Figure 4.1.1 is a
sample PADL input for the M60A1. Surface 1 in Table 4.1.1 is the surface between the turret
and the main hull. It is a collection of edges 1 through 18. Edge 1 consists of points 1 and 3.
Point 1 is at x=3.15, y=1.81, z=1.6. The next section gives a detailed description of the PADL
code in Figure 4.1.1.

4.1.1.1.1. PADL Details

Table 4.1.2 gives a short description of the PADL commands used in Figure 4.1.1.
The text which follows describes how these commands are used.

Table 4.1.2 PADL commands used in Figure 4.1.1


Command   Description

blo       Block primitive object
wed       Wedge primitive object
un        Union operator
dif       Difference operator
at        Location operator
meta      Defines new primitives
plane     Defines a plane

PADL has many primitive objects which can be assembled together to make complex 3-D
objects. The code in Figure 4.1.1 uses the primitive objects blo (a block) and wed (a wedge).
(PADL keywords are shown in boldface.) These primitives are combined using various operators.
The two operators used in this code are un, which unions two objects together, and dif,
which takes the difference between two objects. Objects can be placed at a specified location by
using the at command. For these targets, the positive Z-axis is up, the positive Y-axis is the
front of the target, and the X-axis is the right of the target as viewed from the target. PADL
also allows new primitive objects to be defined by using meta. The meta command is given a
list of planes which bound the object in three space, and a block which contains the new
object. Most of the targets were defined by using meta objects since surfaces were given in the
wire frame data.

The first line in Figure 4.1.1 defines an m60a1 to be the union of the m60a1_hull,
m60a1_turret, m60a1_sm_turret, and m60a1_gun. (The m60a1_sm_turret is the small turret
on top of the main turret.) The next line defines m60a1_box, which is a box that contains the
M60A1. The lines that follow define the turret, small turret, gun, and hull. The line

m60a1_turret = meta(m60a1_turret0, m60a1_turret1, m60a1_turret2,
m60a1_turret3, m60a1_turret4, m60a1_turret5, m60a1_turret6,
m60a1_turret7, m60a1_turret8, box=m60a1_turret_box)


m60a1 = m60a1_hull un m60a1_turret un m60a1_sm_turret un m60a1_gun
m60a1_box = m60a1_hull_box un m60a1_turret_box un m60a1_sm_turret_box
un m60a1_gun_box

m60a1_turret = meta(m60a1_turret0, m60a1_turret1, m60a1_turret2,
m60a1_turret3, m60a1_turret4, m60a1_turret5, m60a1_turret6,
m60a1_turret7, m60a1_turret8, box=m60a1_turret_box)
m60a1_turret0=plane=(rm=(degy=90.0, degz=180.0, movx=-1.5, movy=-1.05, movz=3.0));
m60a1_turret1=plane=(rm=(degy=81.9022, degz=-95.1428, movx=-1.5, movy=-1.05, movz=3.0));
m60a1_turret2=plane=(rm=(degy=83.4285, degz=-53.7462, movx=0.5, movy=-1.23, movz=3.0));
m60a1_turret3=plane=(rm=(degy=90.0, degz=0.0, movx=1.7, movy=-0.485, movz=2.2));
m60a1_turret4=plane=(rm=(degy=85.0073, degz=53.7462, movx=1.7, movy=0.485, movz=2.2));
m60a1_turret5=plane=(rm=(degy=81.9022, degz=95.1428, movx=0.5, movy=1.23, movz=3.0));
m60a1_turret6=plane=(rm=(degy=0.0, degz=0.0, movx=-1.5, movy=1.05, movz=3.0));
m60a1_turret7=plane=(rm=(degy=33.6901, degz=0.0, movx=0.5, movy=-1.23, movz=3.0));
m60a1_turret8=plane=(rm=(degy=180.0, degz=0.0, movx=-1.5, movy=-1.25, movz=1.6));
m60a1_turret_box = blo(x=3.2, y=2.86, z=1.4) at (movx=-1.5, movy=-1.43, movz=1.6)
m60a1_sm_turret = meta(m60a1_sm_turret0, m60a1_sm_turret1,
m60a1_sm_turret2, m60a1_sm_turret3, m60a1_sm_turret4,
m60a1_sm_turret5, box=m60a1_sm_turret_box)

m60a1_sm_turret0=plane=(rm=(degy=80.8304, degz=95.0006, movx=0.5, movy=0.17, movz=3.0));
m60a1_sm_turret1=plane=(rm=(degy=28.369, degz=180.0, movx=-1.1, movy=0.03, movz=3.0));
m60a1_sm_turret2=plane=(rm=(degy=81.4126, degz=-94.6774, movx=0.5, movy=-1.18, movz=3.27));
m60a1_sm_turret3=plane=(rm=(degy=90.0, degz=0.0, movx=0.5, movy=-1.23, movz=3.0));
m60a1_sm_turret4=plane=(rm=(degy=0.0, degz=0.0, movx=0.5, movy=-1.18, movz=3.27));
m60a1_sm_turret5=plane=(rm=(degy=180.0, degz=0.0, movx=-1.1, movy=0.03, movz=3.0));
m60a1_sm_turret_box = blo(x=1.6, y=1.4, z=0.27) at (movx=-1.1, movy=-1.23, movz=3.0)
m60a1_gun = meta(m60a1_gun0, m60a1_gun1, m60a1_gun2, m60a1_gun3,
m60a1_gun4, box=m60a1_gun_box)

m60a1_gun0=plane=(rm=(degy=0.0, degz=0.0, movx=5.96, movy=0.05, movz=2.2));
m60a1_gun1=plane=(rm=(degy=90.0, degz=90.0, movx=1.7, movy=0.05, movz=2.2));
m60a1_gun2=plane=(rm=(degy=90.0, degz=-90.0, movx=5.96, movy=-0.05, movz=2.2));
m60a1_gun3=plane=(rm=(degy=90.0, degz=0.0, movx=5.96, movy=0.05, movz=2.2));
m60a1_gun4=plane=(rm=(degy=180.0, degz=0.0, movx=1.7, movy=0.05, movz=2.1));
m60a1_gun_box = blo(x=4.26, y=0.1, z=0.1) at (movx=1.7, movy=-0.05, movz=2.1)


Figure 4.1.1 PADL input for M60A1.
m60a1_hull_con = meta(m60a1_hull_con0, m60a1_hull_con1,
m60a1_hull_con2, m60a1_hull_con3, m60a1_hull_con4, m60a1_hull_con5,
m60a1_hull_con6, m60a1_hull_con7, m60a1_hull_con8, m60a1_hull_con9,
m60a1_hull_con10, m60a1_hull_con11, m60a1_hull_con12,
m60a1_hull_con13, m60a1_hull_con14, m60a1_hull_con15,
m60a1_hull_con16, box=m60a1_hull_con_box)
m60a1_hull_con0=plane=(rm=(degy=0.0, degz=0.0, movx=3.15, movy=1.81, movz=1.6));
m60a1_hull_con1=plane=(rm=(degy=32.0054, degz=0.0, movx=3.47, movy=1.81, movz=1.4));
m60a1_hull_con2=plane=(rm=(degy=32.0054, degz=0.0, movx=3.47, movy=-1.1, movz=1.4));
m60a1_hull_con3=plane=(rm=(degy=123.818, degz=0.0, movx=2.78, movy=1.81, movz=0.37));
m60a1_hull_con4=plane=(rm=(degy=155.179, degz=0.0, movx=1.98, movy=1.81, movz=0.0));
m60a1_hull_con5=plane=(rm=(degy=123.818, degz=0.0, movx=3.47, movy=-1.1, movz=1.4));
m60a1_hull_con6=plane=(rm=(degy=155.179, degz=0.0, movx=2.78, movy=-1.1, movz=0.37));
m60a1_hull_con7=plane=(rm=(degy=42.2737, degz=180.0, movx=-3.25, movy=1.1, movz=1.6));
m60a1_hull_con8=plane=(rm=(degy=42.2737, degz=180.0, movx=-3.25, movy=-1.81, movz=1.6));
m60a1_hull_con9=plane=(rm=(degy=103.039, degz=-180.0, movx=-3.47, movy=1.1, movz=1.4));
m60a1_hull_con10=plane=(rm=(degy=148.696, degz=-180.0, movx=-3.25, movy=1.1, movz=0.45));
m60a1_hull_con11=plane=(rm=(degy=103.039, degz=-180.0, movx=-3.47, movy=-1.81, movz=1.4));
m60a1_hull_con12=plane=(rm=(degy=148.696, degz=-180.0, movx=-3.25, movy=-1.81, movz=0.45));
m60a1_hull_con13=plane=(rm=(degy=90.0, degz=-90.0, movx=-2.51, movy=-1.81, movz=0.0));
m60a1_hull_con14=plane=(rm=(degy=90.0, degz=90.0, movx=1.98, movy=1.81, movz=0.0));
m60a1_hull_con15=plane=(rm=(degy=180.0, degz=0.0, movx=-2.51, movy=1.81, movz=0.0));
m60a1_hull_con16=plane=(rm=(degy=180.0, degz=0.0, movx=1.98, movy=-1.81, movz=0.0));
m60a1_hull_con_box = blo(x=6.94, y=3.62, z=1.6) at (movx=-3.47, movy=-1.81, movz=0.0)
m60a1_hull_bottom = blo(x=6.03, y=2.2, z=0.37) at movx=-3.25, movy=-1.1, movz=0.0
m60a1_engine = blo(x=1.75, y=2.2, z=1.4) at movx=-3.25, movy=-1.1, movz=0.6
m60a1_hood = wed(x=2.2, y=0.3, z=1.47) at degz=90, degy=-90,
movx=3.47, movy=-1.1, movz=1.6

m60a1_hull = (m60a1_hull_con dif m60a1_hood dif m60a1_hull_bottom) un m60a1_engine


Figure 4.1.1 Continued.
defines the turret, which is a meta object consisting of the nine planes:

m60a1_turret0=plane=(rm=(degy=90.0, degz=180.0, movx=-1.5, movy=-1.05, movz=3.0));
m60a1_turret1=plane=(rm=(degy=81.9022, degz=-95.1428, movx=-1.5, movy=-1.05, movz=3.0));
m60a1_turret2=plane=(rm=(degy=83.4285, degz=-53.7462, movx=0.5, movy=-1.23, movz=3.0));
m60a1_turret3=plane=(rm=(degy=90.0, degz=0.0, movx=1.7, movy=-0.485, movz=2.2));
m60a1_turret4=plane=(rm=(degy=85.0073, degz=53.7462, movx=1.7, movy=0.485, movz=2.2));
m60a1_turret5=plane=(rm=(degy=81.9022, degz=95.1428, movx=0.5, movy=1.23, movz=3.0));
m60a1_turret6=plane=(rm=(degy=0.0, degz=0.0, movx=-1.5, movy=1.05, movz=3.0));
m60a1_turret7=plane=(rm=(degy=33.6901, degz=0.0, movx=0.5, movy=-1.23, movz=3.0));
m60a1_turret8=plane=(rm=(degy=180.0, degz=0.0, movx=-1.5, movy=-1.25, movz=1.6));
m60a1_turret_box = blo(x=3.2, y=2.86, z=1.4) at (movx=-1.5, movy=-1.43, movz=1.6)

which are displayed in Figure 4.1.2. The orientation of the planes came from the surface information
in the wire frame data for the M60A1. The definition of the small turret, sm_turret, is

m60a1_sm_turret = meta(m60a1_sm_turret0, m60a1_sm_turret1,
m60a1_sm_turret2, m60a1_sm_turret3, m60a1_sm_turret4,
m60a1_sm_turret5, box=m60a1_sm_turret_box)

m60a1_sm_turret0=plane=(rm=(degy=80.8304, degz=95.0006, movx=0.5, movy=0.17, movz=3.0));
m60a1_sm_turret1=plane=(rm=(degy=28.369, degz=180.0, movx=-1.1, movy=0.03, movz=3.0));
m60a1_sm_turret2=plane=(rm=(degy=81.4126, degz=-94.6774, movx=0.5, movy=-1.18, movz=3.27));
m60a1_sm_turret3=plane=(rm=(degy=90.0, degz=0.0, movx=0.5, movy=-1.23, movz=3.0));
m60a1_sm_turret4=plane=(rm=(degy=0.0, degz=0.0, movx=0.5, movy=-1.18, movz=3.27));
m60a1_sm_turret5=plane=(rm=(degy=180.0, degz=0.0, movx=-1.1, movy=0.03, movz=3.0));
m60a1_sm_turret_box = blo(x=1.6, y=1.4, z=0.27) at (movx=-1.1, movy=-1.23, movz=3.0)

and the gun is

m60a1_gun = meta(m60a1_gun0, m60a1_gun1, m60a1_gun2, m60a1_gun3,
m60a1_gun4, box=m60a1_gun_box)

m60a1_gun0=plane=(rm=(degy=0.0, degz=0.0, movx=5.96, movy=0.05, movz=2.2));
m60a1_gun1=plane=(rm=(degy=90.0, degz=90.0, movx=1.7, movy=0.05, movz=2.2));
m60a1_gun2=plane=(rm=(degy=90.0, degz=-90.0, movx=5.96, movy=-0.05, movz=2.2));
m60a1_gun3=plane=(rm=(degy=90.0, degz=0.0, movx=5.96, movy=0.05, movz=2.2));
m60a1_gun4=plane=(rm=(degy=180.0, degz=0.0, movx=1.7, movy=0.05, movz=2.1));
m60a1_gun_box = blo(x=4.26, y=0.1, z=0.1) at (movx=1.7, movy=-0.05, movz=2.1)

These  two  are  also  both  meta  objects  which  are  bounded  by  planes  as  displayed  in  Figure  4.1.2. 

The  following  line 


Figure 4.1.2 The PADL objects m60a1_turret, m60a1_sm_turret, and m60a1_gun from Figure
4.1.1.
m60a1_hull_con = meta(m60a1_hull_con0, m60a1_hull_con1,
m60a1_hull_con2, m60a1_hull_con3, m60a1_hull_con4, m60a1_hull_con5,
m60a1_hull_con6, m60a1_hull_con7, m60a1_hull_con8, m60a1_hull_con9,
m60a1_hull_con10, m60a1_hull_con11, m60a1_hull_con12,
m60a1_hull_con13, m60a1_hull_con14, m60a1_hull_con15,
m60a1_hull_con16, box=m60a1_hull_con_box)

which is the first line of the second page of Figure 4.1.1, defines the convex part of the hull of
the M60A1. PADL meta objects must be convex. The hull of the M60A1 has some concavities;
therefore it is defined by taking the convex part, m60a1_hull_con, and subtracting off (using dif)
the concave parts. Figure 4.1.3 shows the convex part, m60a1_hull_con, the engine box,
m60a1_engine, the wedge above the hood, m60a1_hood, and the space between the treads,
m60a1_hull_bottom. Figure 4.1.4 shows the M60A1 hull, m60a1_hull, after subtracting off the
concave parts.
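The construction just described (a convex solid as an intersection of half-spaces, with concavities carved out by set difference) can be sketched as a point-membership test in Python. This is not PADL and does not use PADL's plane orientation parameters; the helper names mirror the PADL keywords only for readability, planes are given here by an outward normal and a point on the plane, and the dimensions are invented.

```python
def half_space(normal, point):
    """Solid side of a plane: all q with normal . (q - point) <= 0."""
    nx, ny, nz = normal
    px, py, pz = point
    def inside(q):
        return nx * (q[0] - px) + ny * (q[1] - py) + nz * (q[2] - pz) <= 0.0
    return inside

def meta(*planes):
    """Convex solid: a point is inside only if it is inside every half-space."""
    return lambda q: all(p(q) for p in planes)

def un(a, b):
    """Union of two solids."""
    return lambda q: a(q) or b(q)

def dif(a, b):
    """Difference: inside a but not inside b."""
    return lambda q: a(q) and not b(q)

def block(x, y, z, at=(0.0, 0.0, 0.0)):
    """Axis-aligned box built from six half-spaces, with one corner at `at`."""
    ox, oy, oz = at
    return meta(
        half_space((-1, 0, 0), (ox, oy, oz)), half_space((1, 0, 0), (ox + x, oy, oz)),
        half_space((0, -1, 0), (ox, oy, oz)), half_space((0, 1, 0), (ox, oy + y, oz)),
        half_space((0, 0, -1), (ox, oy, oz)), half_space((0, 0, 1), (ox, oy, oz + z)),
    )

# Hypothetical hull: a convex block with a slot removed between the "treads".
convex_part = block(4.0, 2.0, 1.0)
slot = block(4.0, 1.0, 0.5, at=(0.0, 0.5, 0.0))
hull = dif(convex_part, slot)

in_slot = hull((2.0, 1.0, 0.25))       # False: carved out by dif
in_tread = hull((2.0, 0.25, 0.25))     # True: beside the slot
above_slot = hull((2.0, 1.0, 0.75))    # True: above the slot
```

A convex solid can never contain a slot like this one, which is why the hull has to be expressed as a convex part minus the concave pieces.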

The other vehicles are defined in a similar way. After converting, the vehicles were read
into PADL and the wire frame, shaded, and range images in Figures 4.1.5 through 4.1.8 were created.
Although the shaded images have the light source to the left of the viewer, they are displayed
here as negatives, so the bright surfaces appear dark. Only the range images in the figures are
used in our simulations. All the vehicles in Figures 4.1.5 through 4.1.8 were scaled so that they were
the size that would be seen by a sensor at 500 meters using a 0.05 mrad instantaneous field of
view (IFOV)† in both directions.

4.1.1.2. The Electronic Field Test

With these models in hand an electronic field test was performed which generated noiseless
LADAR images of each target as viewed from 500 meters. Each test consisted of 60 views
of each target, one view every 6 degrees. With this data we were able to generate images from
any multiple of 500 meters and any multiple of 0.05 mrad IFOV by downsampling the targets.
Figure 4.1.9 shows an M60A1 as viewed from 500m to 5km.
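A minimal sketch of such downsampling, assuming the simplest scheme of keeping every k-th pixel in each direction, is given below. The report does not say which resampling rule was used, so the decimation rule, the optional range offset, and the use of `None` for background pixels are all assumptions made for illustration.

```python
def downsample(range_image, factor, extra_range_m=0.0):
    """Keep every `factor`-th pixel of a range image (list of rows).

    factor        -- integer multiple of the original range or IFOV
    extra_range_m -- optional offset added to each range value when the
                     target is moved farther away (an assumption)
    Background pixels are represented by None and are left untouched.
    """
    return [
        [None if v is None else v + extra_range_m for v in row[::factor]]
        for row in range_image[::factor]
    ]

# Hypothetical 4x4 range image (meters) of a target seen at 500 m.
image_500m = [
    [None, 500.2, 500.3, None],
    [500.1, 500.0, 500.0, 500.4],
    [500.1, 500.0, 500.0, 500.4],
    [None, 500.2, 500.3, None],
]
image_1000m = downsample(image_500m, factor=2, extra_range_m=500.0)
```

Doubling the range at a fixed IFOV halves the number of pixels across the target, which is why a factor of 2 stands in for a 1000 meter view here.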

4.1.1.3. Conclusions

We are now able to model targets “as they come from the factory”. For many of our
experiments this data will be sufficient. Future work in the modeling area will look into modeling
targets “after user modifications”. That is, it is a common practice for an operator of a
vehicle to attach clutter to the target once a vehicle is in the field. Such clutter should be easily
modeled by the fractal trees discussed in the sections that follow.


† The IFOV is the angular measurement between adjacent pixels.


Figure 4.1.3 The PADL objects m60a1_hull_con, m60a1_hull_bottom, m60a1_engine, and
m60a1_hood from Figure 4.1.1.

Figure 4.1.8 Wireframe, shaded, and range image of M60A1.

4.1.2. Modeling Clutter

Modeling targets alone is not enough to generate realistic scenes. With only models of targets
we can simulate a target sitting in the middle of a plane with no clutter around it. Unfortunately
real targets are both surrounded by clutter and have clutter attached to them. Fractals
have been shown to accurately model natural objects [Mand]. The following sections give a
brief introduction to fractals and show how we used them to model the terrain of the earth and
trees.

4.1.2.1. Fractals

Fractals, as defined by Mandelbrot [Mand], are a family of shapes which describe many of
the fragmented and irregular shapes around us. The most useful fractals involve chance, and
both their regularities and their irregularities are statistical. The most useful feature of fractals
for generating clutter is that the irregularity is similar at all scales. That is, a mountain can be
viewed at 10km and many irregularities can be seen. Moving to 1km will reveal many finer
irregularities. These fragmented patterns described by fractals are needed to model clutter, whether
it is at 10km or 10 meters.

4.1.2.2. Fractal-Based Terrain Generation

The generation of mountains is one of the areas in which fractals have been successful in generating
realistic looking objects using only a few parameters. Kornfeld [Komi] used fractals to generate
the crest line structures used in the background of her simulated FLIR images. Figure
4.1.10 shows how she approximates mountain ranges using flat layers at different distances
from the viewer. The top edges of the crests were created using fractals. Such an approach is a
nice simplification which allows the simulation program to synthesize the FLIR image very
quickly. However, it cannot be used for simulating range images, for the obvious reason that the
sensor would pick out the flat layers.

Fournier et al. [FFC] use fractals to generate 3-D mountains by computing the elevation of
points above a grid. The grid, as described here, is merely a collection of numbers which can be
interpreted as elevations above a plane. The spacing between grid points is not of interest here
and is later set according to the dimensions of other objects in the synthetic range image. Each
grid point has a certain neighborhood associated with it. The elevation of the point is just the
average of the neighborhood perturbed by a random amount. Also, the orientation of the neighborhood
relative to the point of interest depends on its position. Figure 4.1.11 shows the order
in which elevations are computed.

The grid is filled in as follows. After setting the four corner elevations, filling the rest of
the grid is merely an exercise in recursion. First the corner points (the ones labeled 0) are
selected. The values chosen for these points depend on the type of terrain the designer is simulating.
If a rolling plane is being simulated, all four points will be given about the same value.
If the side of a mountain is being simulated, the bottom points labeled 0 may be given one


Figure  4. 1.1 1  Order  of  grid  point  computation.  The  four  comer  points  are  set  before  computa¬ 
tion  begins.  Order  of  computation  is:  la,  lb,  2a,  2b.  An  (*)  indicates  a  point 
interpolated  from  boundary  values  only. 
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value,  and  the  top  points  labeled  0  will  be  given  a  greater  value.  After  the  comer  points  are 
labeled  the  routine  will  pick  a  value  for  the  point  labeled  la  by  finding  the  average  for  all  the 
points  labeled  0  and  then  adding  a  random  value  to  it.  (The  neighbors  of  la  are  the  0  points.) 
Next  the  values  for  the  lb  points  are  found  by  averaging  the  0  points  to  the  left  and  right  (or 
above  and  below  if  the  lb  is  on  the  left  or  right)  with  the  value  of  the  la  point.  Next  the  values 
of  the  2a  points  are  found  by  averaging  the  neighbors  that  are  shown  with  arrows  in  Figure 
4.1.1 1.  Likewise  with  point  2b.  Notice  that  the  neighbors  of  la  and  lb  are  above,  below,  left, 
and  right  of  of  them  while  the  neighbors  of  2a  and  2b  are  oriented  differently  in  that  they  are 
on  the  diagonals  from  them. 
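The recursive fill described above can be sketched in a few lines of Python. This is only a minimal illustration and not the routine used for the figures in this report: it follows the widely used "diamond-square" ordering of midpoint displacement, which may differ in detail from the neighborhoods drawn in Figure 4.1.11. The function name, the uniform perturbation, and the `roughness` and `decay` parameters are assumptions made for the example.

```python
import random


def fractal_terrain(n_levels, corners, roughness=1.0, decay=0.5, seed=None):
    """Fill a (2**n_levels + 1)-square elevation grid by midpoint displacement.

    corners = (top-left, top-right, bottom-left, bottom-right) elevations.
    roughness and decay are illustrative parameters, not values from the report.
    """
    rng = random.Random(seed)
    size = 2 ** n_levels + 1
    last = size - 1
    grid = [[0.0] * size for _ in range(size)]
    grid[0][0], grid[0][last], grid[last][0], grid[last][last] = corners

    step, scale = last, roughness
    while step > 1:
        half = step // 2
        # "a" pass: centre of every square = mean of its four corners + noise.
        for r in range(half, size, step):
            for c in range(half, size, step):
                avg = (grid[r - half][c - half] + grid[r - half][c + half] +
                       grid[r + half][c - half] + grid[r + half][c + half]) / 4.0
                grid[r][c] = avg + rng.uniform(-scale, scale)
        # "b" pass: edge midpoints = mean of the neighbours above, below, left
        # and right that lie inside the grid (boundary points have only three).
        for r in range(0, size, half):
            start = half if (r // half) % 2 == 0 else 0
            for c in range(start, size, step):
                near = [grid[r + dr][c + dc]
                        for dr, dc in ((-half, 0), (half, 0), (0, -half), (0, half))
                        if 0 <= r + dr < size and 0 <= c + dc < size]
                grid[r][c] = sum(near) / len(near) + rng.uniform(-scale, scale)
        step = half
        scale *= decay  # smaller perturbations at finer levels of detail
    return grid
```

Calling `fractal_terrain(4, (0.0, 0.0, 10.0, 10.0), seed=1)` would give a 17 by 17 grid that rises from one edge to the opposite edge, in the spirit of the "side of a mountain" case; equal corner values give the rolling-plain case.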

Figure  4.1.12  shows  the  different  phases  of  a  terrain  patch  at  different  times  in  the  process 
of  adding  more  detail.  Figure  4.1.13  shows  a  patch  of  terrain  with  some  targets  on  it. 

4.1.2.3. Fractal-Based Tree Generation

Background clutter such as mountains is not the only form of clutter in a range image.
Another form, foreground clutter, is more difficult because it can obscure the targets of
interest. Our first approach to simulating this type of clutter is to simulate trees. The next
sections review two techniques used to simulate trees in FLIR simulations, and the final section
shows the approach being taken here.

4.1.2.3.1. Tree Simulation at CNVEO

Gertrude Kornfeld [Korn2] simulates trees by cutting a tree out of actual FLIR imagery
and pasting it into the simulated imagery. One problem with this approach is that all the trees
will look the same. To overcome this problem she distorts the tree image through non-linear
scaling before pasting it into the synthetic image. Using this approach she can have several
trees (46 in one example) in a scene which are distortions of just a few trees (three trees in the
same example).

The main problem with this approach is that, like the mountains, the trees are flat. The
same tree cannot be viewed from different angles. For her work this is fine, but for an
Electronic Terrain Board Model, one must be able to describe the terrain board in three-space and
then view the objects from any angle.

4.1.2.3.2. The GTRI Model of Trees

The Georgia Tech Research Institute has a FLIR simulator called GTVISIT which contains
a three-dimensional tree model. This model (shown in Figure 4.1.14) contains over 32,000
facets which describe the tree's reflectivity in three dimensions. The simulator allows the tree
to be copied, scaled, rotated, and placed anywhere in the scene. This approach overcomes the
problem of having a 2-D tree, but because the tree model is so large, it is the only hardwood
tree in the model, and scaling and rotation are used to give the appearance of different trees.


Figure  4.1.12  (Continued) 


Figure 4.1.13 Shaded and range images of a simulated M60A1 and an M113 on a terrain patch.


Figure 4.1.14 Simulated hardwood tree in GTVISIT model. The same tree appears three times,
each time scaled and rotated differently.

4.1.2.3.3. The Fractal Approach to Trees

A  three  dimensional  model  of  a  tree  must  be  used  for  simulating  LADAR  images.  This 
model  should  be  detailed  enough  to  look  like  a  tree  to  the  sensor  being  modeled,  but  be  simple 
enough  for  several  different  trees  to  be  used  in  the  same  image.  One  approach  is  to  grow  the 
trees  as  they  are  needed  by  using  a  formal  grammar.  This  approach,  presented  in  [Smit],  uses  a 
formal  grammar  (an  example  is  shown  in  Figure  4.1.15)  which  describes  how  the  tree  is  grown 
from  one  generation  to  the  next.  Each  generation  is  grown  by  taking  the  symbols  on  the  left 
side  of  the  production  rules  and  replacing  them  with  the  symbols  on  the  right  of  the  production 
rules.  If  we  define  the  Z-axis  as  being  up,  the  angle  of  the  branches  around  the  Z-axis  and  down 
from  the  Z-axis  can  be  randomly  chosen.  The  range  of  the  random  numbers  determines  the  type 
of  tree  being  grown. 
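As a rough illustration of growing a tree from a grammar, the following Python sketch rewrites a string of symbols generation by generation and then turns the result into branch segments. The production rule shown is hypothetical (the grammar of Figure 4.1.15 is not reproduced here), and the symbol meanings, the angle range, and the length `shrink` factor are all assumptions made for the example.

```python
import math
import random

# Hypothetical production rule: a tip T becomes a branch B carrying three new tips.
RULES = {"T": "B[T][T][T]"}


def rewrite(axiom, rules, generations):
    """Apply the production rules to every symbol, once per generation."""
    symbols = axiom
    for _ in range(generations):
        symbols = "".join(rules.get(ch, ch) for ch in symbols)
    return symbols


def interpret(symbols, polar_range_deg=(20.0, 50.0), length=1.0, shrink=0.7, seed=None):
    """Turn a symbol string into a list of (start, end) 3-D branch segments.

    'B' draws a branch, '[' starts a side branch in a random direction,
    ']' returns to the point where that side branch started.
    """
    rng = random.Random(seed)
    pos, direction, seg_len = (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), length  # Z is up
    stack, segments = [], []
    for ch in symbols:
        if ch == "B":
            end = tuple(p + seg_len * d for p, d in zip(pos, direction))
            segments.append((pos, end))
            pos = end
            seg_len *= shrink
        elif ch == "[":
            stack.append((pos, direction, seg_len))
            azimuth = rng.uniform(0.0, 2.0 * math.pi)             # angle around Z
            polar = math.radians(rng.uniform(*polar_range_deg))   # angle down from Z
            direction = (math.sin(polar) * math.cos(azimuth),
                         math.sin(polar) * math.sin(azimuth),
                         math.cos(polar))
        elif ch == "]":
            pos, direction, seg_len = stack.pop()
    return segments
```

With this rule, `interpret(rewrite("T", RULES, 4), seed=3)` yields a fourth generation tree; changing `polar_range_deg` changes how widely the branches spread, which is the sense in which the range of the random numbers selects the type of tree.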

This  approach  has  the  advantage  that  every  tree  used  can  be  a  different  tree.  This  reduces 
the  possibility  of  an  algorithm  becoming  accidentally  tuned  to  a  given  tree.  Once  a  tree  is 
grown,  it  can  be  placed  on  the  Electronic  Terrain  Board  Model  and  viewed  from  any  angle. 
Figure  4.1.16  is  a  fourth  generation  tree  using  the  grammar  in  Figure  4.1.15.  The  branches  of 
the  tree  are  cylinders,  all  of  the  same  diameter,  and  there  are  no  leaves  on  the  tree.  Future  work 
should  make  the  tree  look  more  like  a  tree  as  viewed  by  a  LADAR  sensor. 

4.1.2.4. Real Terrain Board Data

Although fractals can be used to generate realistic-looking terrain, real terrain data looks
even more realistic. Figure 4.1.17 is an image created using the elevation data from the Night
Vision Laboratory Terrain Model. This data could have been used in generating the scenes in this
report; however, it was not available at the time the work was being done.

4.1.3.  Noise  Degradation  of  Synthetic  Range  Imagery 

The next two components of the Electronic Terrain Board Model which need to be
addressed are the modeling of the sensor and the current environmental conditions. The purpose
of these models is to add the correct type and amount of noise to the synthesized image to
simulate the sensor and how it interacts with the environment. We lacked specific information
on the LADAR sensor, so we chose to examine the noise present in the LADAR imagery taken
during the 1986 A.P. Hill test. The next sections describe our analysis of the noise, and the
creation of noisy synthetic images.

Both  the  general  nature  of  the  noise  (e.g.  Gaussian,  uniform,  etc.),  and  how  the  noise 
varies  with  range  are  studied  here. 

4.1.3.1. Analysis of LADAR Noise

The analysis which follows is based on experiments performed with the old laser range
data, i.e., data in which background pixels are merely noise. Depending on the changes made to
the sensor, the results presented may change drastically. Recall, however, that the main
objective here is to determine whether realistic images can be generated, and the actual data
used is the only data available for comparison.


Figure 4.1.15 Grammar for growing a tree. (a) Production rules, (b) Generation


Figure 4.1.17a Mountains generated from NVL terrain board data. Facet image.


Figure 4.1.17b Mountains generated from NVL terrain board data. Shaded image.


Figure 4.1.17c Mountains generated from NVL terrain board data. Shaded image with smoothing.

4.1.3.1.1. General Characteristics

The  five  ton  truck  target  is  used  for  the  analysis  in  this  section  and  the  next.  This  target 
provides  a  large  flat  surface  from  which  to  extract  a  window  of  data  at  ranges  of  1.54,  2.91,  and 
4.24  km. 

Figure 4.1.18 is a sample image from the 1.54 km set showing the hand-selected rectangular
window used to extract noise data from only the target. In many simulation problems it may be
possible to assume Gaussian noise characteristics. However, the histogram in Figure 4.1.19 of
the data in this window clearly indicates that a more complex model is necessary here. Notice
that the extremities of the histogram are flat and exhibit a uniform density characteristic,
while the central region looks somewhat Gaussian.

As is apparent from the background of Figure 4.1.18, if an unreasonable return or no return
is received by the sensor, an arbitrary gray level between 0 and 255 is assigned. It seems
reasonable to expect that some on-target pixels may be assigned similarly. The histogram of
Figure 4.1.19 supports this, since most pixels contain a gray level corresponding to a range
measurement corrupted by noise (the central region of the histogram) while the rest represent a
gray level chosen at random.

Once the random pixels are separated from the data, the remaining noise is approximated as
Gaussian, since many factors act together to perturb a given range measurement. Assuming
ergodicity (as is done throughout this analysis and the sections to follow), the relative
frequency of randomly chosen gray levels within a given window is a good approximation of the
probability that a given pixel has a random gray level. This estimate is useful in determining
which pixels of a synthetic range image should be corrupted with uniform noise.

As was pointed out earlier, the far ends of the histogram have the uniform characteristic
which would be evident in samples chosen completely at random. Thresholds are therefore chosen
beyond which the histogram is considered to be uniform. Next it is necessary to determine how
many of the gray levels of the central region are due to random assignment. This is simple,
since a basic property of the uniform distribution is an even spread throughout the possible
range of values. The following is an example of the necessary interpolation:


HIGH = gray level greater than the mean, above which the histogram is
       considered uniform.

LOW  = gray level less than the mean, below which the histogram is
       considered uniform.

N    = (255 - HIGH + 1) + (LOW + 1)
     = number of gray levels in the uniform regions.

P    = number of pixels in the uniform regions.

P/N  = average number of pixels per gray level in the uniform regions.

X    = (P/N) * (HIGH - LOW + 1)
     = approximate number of random pixels between HIGH and LOW.


Range = 1.54 km, ap1.32845

Figure 4.1.18 5 Ton Truck, Image 45, Date 3/28/86, Range 1.54 km. Hand selected window of
1416 pixels from 160 x 96 pixel laser range image.

The  analysis  above  completes  one  important  step  in  the  noise  synthesis  problem.  Given  a 
synthetic  range  image  of  a  target,  it  is  now  known  what  fraction  of  on-target  pixels  should  be 
corrupted  by  random  noise.  The  next  task  is  to  determine  the  variance  of  the  Gaussian  portion. 
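The interpolation can be written out as a short Python sketch. The function name and the 256-entry histogram layout are assumptions made for the example, and the final ratio (P + X) / total is one reading of "the fraction of on-target pixels corrupted by random noise", not a formula stated explicitly in this report. The count HIGH - LOW + 1 is taken directly from the definitions above.

```python
def dropout_estimate(hist, low, high):
    """Estimate the random-pixel (drop-out) fraction from a gray level histogram.

    hist is a list of 256 pixel counts indexed by gray level; low and high are
    the LOW and HIGH thresholds bounding the central, Gaussian-looking region.
    """
    n_levels = (255 - high + 1) + (low + 1)               # N: levels in the uniform tails
    p_uniform = sum(hist[:low + 1]) + sum(hist[high:])    # P: pixels in the uniform tails
    per_level = p_uniform / n_levels                      # P/N
    # X: random pixels hidden in the central region, using the count of
    # levels (HIGH - LOW + 1) given in the report's definition.
    x_central = per_level * (high - low + 1)
    dropout = (p_uniform + x_central) / sum(hist)         # assumed definition
    return per_level, x_central, dropout
```

For a window in which every gray level holds two random pixels and the remaining pixels cluster near the mean, `per_level` comes out as 2 and `x_central` as twice the number of central gray levels, which is the even spread the text relies on.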

In order to estimate the Gaussian characteristic it is merely necessary to subtract the
random pixels from the aggregate noise histogram. The problem is determining which pixels
between  HIGH  and  LOW  are  random,  and  which  are  true  measurements  perturbed  by  Gaussian 
noise.  These  random  pixels  are  approximated  by  randomly  choosing  X  gray  levels  between 
HIGH  and  LOW.  If  the  random  pixel  value  chosen  did  not  appear  in  the  original  image,  the 
value  will  be  discarded  and  a  new  random  value  selected.  After  subtracting  these  from  the 
aggregate  noise,  an  approximation  of  the  Gaussian  noise  remains.  See  Figures  4.1.20  and 
4.1.21.  Again,  the  histogram  of  Figure  4.1.20  does  not  convey  much  useful  information  by 
itself,  but  it  is  used  to  extract  the  Gaussian  data  contained  in  the  aggregate  noise  histogram. 
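A minimal Python sketch of this subtraction step follows. It assumes the histogram is a 256-entry list of counts and treats the levels strictly between LOW and HIGH as the central region; a randomly chosen level is redrawn when no pixels remain at that level, which is one way to read the "discard and reselect" rule above. The function name and the cap on the number of removals are assumptions added to keep the example well behaved.

```python
import math
import random


def extract_gaussian(hist, low, high, x_random, seed=None):
    """Remove about x_random random pixels from the central region of hist.

    Returns the remaining central histogram with its mean and standard deviation.
    """
    rng = random.Random(seed)
    central = [0] * 256
    for g in range(low + 1, high):          # tails are treated as purely random
        central[g] = hist[g]

    # Never try to remove more pixels than the central region contains.
    to_remove = min(int(round(x_random)), sum(central))
    removed = 0
    while removed < to_remove:
        g = rng.randint(low + 1, high - 1)
        if central[g] == 0:                 # level not present: draw again
            continue
        central[g] -= 1
        removed += 1

    n = sum(central)
    if n == 0:
        raise ValueError("no pixels left in the central region")
    mean = sum(g * c for g, c in enumerate(central)) / n
    var = sum(c * (g - mean) ** 2 for g, c in enumerate(central)) / n
    return central, mean, math.sqrt(var)
```

The mean and standard deviation returned here play the role of the values quoted in the caption of Figure 4.1.21.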

This  concludes  the  examination  of  the  general  noise  characteristics.  The  following  section 
addresses  the  question  of  how  these  characteristics  depend  on  range. 

4.1.3.1.2. Noise Variation With Range

Only two parameters of the previous section will be examined here to determine their
variation with range. They are:

(1) The drop-out probability.

(2) The variance of the Gaussian data.

Item (1) is essentially the relative frequency of pixels within a given data window that have
random gray levels. Item (2) follows from a straightforward calculation after the random pixels
are subtracted from the aggregate noise data.

The  five  ton  truck  target  is  again  used  in  the  following  study,  but  at  ranges  of  1.54,  2.91, 
and  4.24  km.  Figures  4.1.18,  4.1.22,  and  4.1.23  show  sample  data  at  these  ranges  along  with  the 
windows  used  for  data  extraction.  Figures  4.1.19,  4.1.24,  and  4.1.25  are  histograms  of  this  data. 


Figure 4.1.20 Histogram of uniform gray levels extracted from aggregate histogram of 1416
pixel window.

Figure 4.1.21 Histogram of Gaussian gray levels extracted from aggregate histogram of 1416
pixel window. Mean = 122.603, Standard Deviation = 12.039.


Range = 2.91 km, ap1.32702

Figure 4.1.22 5 Ton Truck, Image 02, Date 3/27/86, Range 2.91 km. Hand selected window of
1325 pixels from 160 x 96 pixel laser range image.


Range = 4.24 km, ap1.32801

Figure 4.1.23 5 Ton Truck, Image 01, Date 3/28/86, Range 4.24 km. Hand selected window of
375 pixels from 160 x 96 pixel laser range image.

Figure 4.1.24 Aggregate gray level histogram of 1325 pixel window. Mean = 129.001,
Standard Deviation = 14.862.


Figure 4.1.25 Aggregate gray level histogram of 375 pixel window. Mean = 122.491,
Standard Deviation = 23.961.

First the separation process discussed above is applied to the images of each range
measurement set. Next, the parameters of interest are calculated. Plots of this information for
all three range sets are superimposed in Figures 4.1.26 and 4.1.27. Table 4.1.3 summarizes the
results.

Table 4.1.3 Summary of drop-out probabilities and Gaussian standard deviation statistics for
the 1.54, 2.91, and 4.24 km data sets.

Range     Drop-out Probability          Gaussian Standard Deviation
(km)      Max.      Min.      Avg.      Max.      Min.      Avg.

1.54      0.102     0.031     0.073     12.237    9.388     11.199
2.91      0.032     0.028     0.030     10.816    8.777      9.797
4.24      0.099     0.005     0.027     17.199    7.001     10.353

There are some surprises in these results. For example, the only apparent variation with
range is actually counterintuitive. The drop-out probability is consistently higher for the 1.54
km data set than for the other two data sets. The reason for this is unclear at present. Another
interesting note is the wide variation in the data quality of the 4.24 km data set. The images
corresponding to the highest and lowest points of the standard deviation plot for this data set
are given in Figures 4.1.28 and 4.1.29.

4.1.3.1.3. Summary

There are many factors that could affect range data that have not been addressed here. The
tuning of the sensor for the different range sets, or a difference in prevailing weather
conditions, might have a dramatic effect on the characteristics of the data. But, from the data
presented, it appears that the parameters discussed do not vary with range by any appreciable
amount.

Even  with  the  unknown  factors  mentioned  here,  the  results  are  still  very  useful  in  noise 
synthesis.  The  separation  of  noise  into  two  types  of  distributions  along  with  the  averages  and 
bounds  of  Table  4.1.3  will  make  it  possible  to  realistically  degrade  synthetic  range  imagery. 

4.1.3.2. Noise Degradation of Synthetic Imagery

As  with  the  rest  of  the  noise  study  presented  so  far,  the  primary  parameters  of  interest  are: 
the  “Drop-Out  Probability”  (DOP),  and  the  Gaussian  Standard  Deviation  (GSD).  A  range  of 
values  for  these  parameters  exists  from  the  results  presented  in  the  previous  section. 

Figure 4.1.26 Composite graph of 1.54, 2.91, and 4.24 km data. Shows image to image
variation of drop-out probability.


Range = 2.91 km, ap1.32828

Figure 4.1.28 5 Ton Truck, Image 28, Date 3/28/86, Range 2.91 km. Image with the largest
Gaussian standard deviation. Gaussian Standard Deviation = 17.199.


Range = 2.91 km, ap1.32813

Figure 4.1.29 5 Ton Truck, Image 13, Date 3/28/86, Range 2.91 km. Image with the smallest
Gaussian standard deviation. Gaussian Standard Deviation = 7.001.

Again, by the ergodicity assumption, these parameters, which have been determined
through spatial averaging, will be applied to synthetic imagery on a pixel-by-pixel basis to
provide realistic noise degradation. For example, each on-target pixel should represent either a
drop-out measurement or a valid measurement perturbed by Gaussian noise. Thus, after the
DOP and GSD noise parameters are chosen, the noise degradation of a synthetic image
proceeds as follows.

As each on-target pixel is considered, a random number between 0 and 1 is generated. If
this number is less than or equal to the DOP parameter selected, the pixel of interest is replaced
with a random gray level between 0 and 255. Otherwise, the pixel is additively corrupted with a
Gaussian number of the proper variance. Figures 4.1.30-4.1.32 show examples of original and
noise-corrupted synthetic images at ranges of 0.5, 3, and 5 km.
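The per-pixel procedure can be sketched in Python as follows. The image is assumed to be a list of rows of 8-bit gray levels with a matching on-target mask; the function name, the rounding, and the clipping of the result to the range 0 to 255 are assumptions made for the example rather than details given in this report.

```python
import random


def degrade(image, on_target, dop, gsd, seed=None):
    """Return a noisy copy of image; only pixels flagged in on_target change.

    dop is the drop-out probability and gsd the Gaussian standard deviation.
    """
    rng = random.Random(seed)
    out = [row[:] for row in image]
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            if not on_target[r][c]:
                continue
            if rng.random() <= dop:
                # Drop-out: replace the pixel with a random gray level.
                out[r][c] = rng.randint(0, 255)
            else:
                # Valid return: add zero-mean Gaussian noise, then clip
                # to the 8-bit range (clipping is an assumption).
                noisy = value + rng.gauss(0.0, gsd)
                out[r][c] = min(255, max(0, int(round(noisy))))
    return out
```

For instance, `degrade(image, mask, 0.05, 12.0)` would apply the parameter values quoted in the captions of Figures 4.1.30 through 4.1.32.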

Next, it is important to determine how well the synthetic and actual images compare. To
perform this comparison, the noise analysis developed above is applied to actual tank images,
and the noise parameters extracted are used to degrade synthetic PADL imagery. An obvious
problem here is the lack of large planar tank surfaces from which to extract a window of data.
The windows used cover most of the tank base, and because of this, some of the noise is
actually range variation of the tank surface. These variations are not readily discernible,
however, and should not affect this visual comparison (see Figures 4.1.33 - 4.1.35).

This last comparison provides some other useful information as well. It provides an
excellent test of the noise analysis technique which has been a major portion of this entire
section on noise characteristics and their simulation. The noise parameters of the actual
imagery are calculated for the indicated window and shown in part (a) of the figure captions.
These parameters are then specified as the desired characteristics for the synthetic image. The
resultant noisy synthetic image is then analyzed in the same way, and its noise parameters appear
in part (b) of the figure captions. Note how well the parameters agree.

4.1.3.3. Future Work

The first attempts at synthesizing LADAR images are very encouraging. The synthetic
images look very much like the real images. One visual difference is that the edges of the
synthetic target are sharper than those of the real target. Future work will attempt to measure
the blurred edges in the real images and mimic them in the synthetic images. Another problem to
be addressed is to verify that the synthetic images look the same to the image processing
algorithms as the real images.

4.2.  ETBM  via  TWIN 

Section 4.1 presented our approach for building an Electronic Terrain Board Model
(ETBM) using the PADL solid modeler. This section presents the additional capabilities we
have gained by moving the modeler to the TWIN Solid Modeling Package and shows some
examples of the detailed images that can be produced by it. In addition, the efforts to convert
the BRL models to TWIN are discussed.


Figure 4.1.30 Clean and noisy images of synthetic PADL M60A1 tank. Range = 0.5 km,
Probability of No Return = 0.05, Gaussian Standard Deviation = 12.0.


Figure 4.1.31 Clean and noisy images of synthetic PADL M60A1 tank. Range = 3.0 km,
Probability of No Return = 0.05, Gaussian Standard Deviation = 12.0.


Figure 4.1.32 Clean and noisy images of synthetic PADL M60A1 tank. Range = 5.0 km,
Probability of No Return = 0.05, Gaussian Standard Deviation = 12.0.


Figure 4.1.33 Actual vs. synthetic laser range imagery. (a) Actual: range = 1.19 km,
Probability of No Return = 0.041, Gaussian Standard Deviation = 14.509. (b) Synthetic:
range = 1.0 km, Probability of No Return = 0.037, Gaussian Standard Deviation = 14.471.


Figure 4.1.34 Actual vs. synthetic laser range imagery. (a) Actual: range = 1.935 km,
Probability of No Return = 0.027, Gaussian Standard Deviation = 8.765. (b) Synthetic:
range = 2.0 km, Probability of No Return = 0.019, Gaussian Standard Deviation = 8.719.


Figure 4.1.35 Actual vs. synthetic laser range imagery. (a) Actual: range = 2.91 km,
Probability of No Return = 0.018, Gaussian Standard Deviation = 6.576. (b) Synthetic:
range = 3.0 km, Probability of No Return = 0.018, Gaussian Standard Deviation = 6.426.

4.2.1. The TWIN Solid Modeling Package

The Electronic Terrain Board Model, presented in Section 4.1, was built upon the PADL
[Padl] solid modeling system. Synthetic range images of targets, terrain, and clutter were all
produced using PADL. However, PADL was designed to model industrial parts and not terrain
boards, and therefore is unable to handle the large number of objects needed to synthesize a
complicated scene. We have switched to the TWIN Solid Modeling Package to overcome the
size limitation problems and gain more flexibility in the modeler. By doing so we have gained a
package that can serve both as a terrain board modeler and as a data structure for model-based
target recognition. TWIN is a boundary representation solid modeler which supports planar
surfaces and is available from the Purdue CADLAB [CAD] as a library of C routines. Appendix D
is a partial listing of the routines in the package. By interfacing these TWIN routines to our
Lisp image processing environment, planar boundary representation (BRep) objects can be
created from the same types of primitives as used by PADL, that is, the CSG (constructive solid
geometry) primitives. Therefore most models used in PADL can be easily converted to TWIN
models. The next paragraph describes how an M60A1 TWIN model is built.

Figure 4.2.1 is a partial list of the points, edges, and surfaces (from ERIM) which describe
the M60A1 tank. (Figure 4.2.1 is the same as Table 4.1.1.) The Lisp data for Figure 4.2.1 is
shown in Figure 4.2.2. Each symbol p# is a point, which is a list of X, Y, Z coordinates (in
meters). Due to the large number of points, edges, and surfaces in the M60A1 model, only the
first few of each are shown in the figure. Each symbol e# is an edge, which is an ordered list of
two points. Each symbol s# is a surface, which is bounded by the edges in its list. Normal edges
are traversed from the first point to the second. Negative edges are traversed from the second
point to the first. Finally, m60a1-all is a list of all the surfaces which bound the M60A1 model.
Figure 4.2.3 is a program for creating an M60A1 TWIN object given the m60a1-all data. The
first setq calls the function object-to-twin, which converts the object described by m60a1-all to
a TWIN object. The second setq defines the main gun of the tank as a cylinder, which is a
primitive TWIN object. The Cylinder routine is given the X, Y, Z location of the base of the
cylinder, the X, Y, Z direction of the axis of the cylinder, the radius of the cylinder, and
finally the number of facets to use to approximate the cylinder. The size and location of the
cylinder are described symbolically in terms of the points defined in the ERIM wireframe model
(i.e., p191, p186, and p189 are all points from the M60A1 model). Once the body and the main
gun are defined, they are unioned together into one object using the Combine function.
Therefore the object returned by make-m60a1 is the union of the body and the gun. This code is
much simpler than the code needed to describe the same tank in PADL (see Figure 4.1.2). Once
the TWIN object is defined, a synthetic image can be generated from Lisp by entering:

(render m60a1 :ifov 0.05 :range 1.0 :size '(200 100))

which will generate a 200 by 100 pixel range image of an M60A1 at one kilometer with a
resolution of 0.05 mrad. Figure 4.2.4 is the output of the render program.
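The ground coverage implied by these render parameters can be checked with a short calculation. The Python sketch below assumes that the 0.05 mrad figure is the angle subtended by one pixel and that the range is given in kilometers, as the sentence above suggests; the function name is hypothetical and the small-angle approximation is used.

```python
def footprint_m(ifov_mrad, range_km, size_px):
    """Per-pixel footprint and total image extent, both in meters.

    Uses the small-angle approximation: footprint = angle (rad) * range (m).
    """
    metres_per_pixel = (ifov_mrad * 1e-3) * (range_km * 1e3)
    extent = tuple(n * metres_per_pixel for n in size_px)
    return metres_per_pixel, extent


per_pixel, (width_m, height_m) = footprint_m(0.05, 1.0, (200, 100))
```

Under these assumptions each pixel covers about 0.05 m at one kilometer, so the 200 by 100 image spans roughly 10 m by 5 m.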
Figure 4.2.1 Partial listing of points, edges, and surfaces for the wireframe model of an
M60A1. (All dimensions are in meters.)

;;; M60 POINT FILE
;;;
;;;             X      Y     Z   (in meters)
(setq p1 '( 3.15   1.81  1.6))
(setq p2 '( 3.15  -1.81  1.6))
(setq p3 '(-3.25   1.81  1.6))
(setq p4 '(-3.25  -1.81  1.6))
(setq p5 '(-3.47   1.81  1.4))

;;; M60 EDGE FILE
;;;
;;;         point1 point2
(setq e1 '(p1     p3))
(setq e2 '(p3     p9))
(setq e3 '(p9     p15))
(setq e4 '(p15    p17))
(setq e5 '(p17    p21))

;;; M60 SURFACE FILE
;;;
;;;         Edge numbers
(setq s1 '(e1 e2 e3 e4 e5 e6 e7 e8 e9
           e10 e11 e12 e13 e14 e15 e16
           e17 e18))                  ; HULL TOP
(setq s2 '(e19 e-18 e20 e21))         ; LEFT FENDER TOP
(setq s3 '(e22 e-14 e23 e24))         ; RIGHT FENDER TOP
(setq s4 '(e25 e26 e-20 e-17))        ; LEFT INSIDE FENDER
(setq s5 '(e27 e-15 e-22 e28))        ; RIGHT INSIDE FENDER

;;; The following describes the surfaces that bound an m60a1.
;;; These are the same surfaces described in the m60a1 wireframe data
;;; from ERIM.

(setq m60a1-all '(s1 s2 s3 s4 s5 s6 s7 s8 s9 s10 s11 s12 s13 s14
                  s15 s16 s17 s18 s19 s20 s21 s22 s23 s37 s38 s39 s40
                  s65 s66 s67 s24 s25 s26 s27 s28 s29 s30 s31 s32 s33
                  s34 s35 s36))

Figure 4.2.2 LISP data for the M60A1 model described in Figure 4.2.1.
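The convention for normal and negative edges can be illustrated with a small Python sketch that turns a surface's edge list into an ordered loop of vertices. The square below is hypothetical data made up for the example (it is not part of the M60A1 model), and the function name is an assumption; only the naming convention, in which `e-2` means edge `e2` traversed from its second point to its first, follows the listing above.

```python
# Hypothetical data: a unit square in the Z = 0 plane.
points = {"q1": (0.0, 0.0, 0.0), "q2": (1.0, 0.0, 0.0),
          "q3": (1.0, 1.0, 0.0), "q4": (0.0, 1.0, 0.0)}
edges = {"e1": ("q1", "q2"), "e2": ("q3", "q2"),   # e2 is stored "backwards"
         "e3": ("q3", "q4"), "e4": ("q4", "q1")}
square = ["e1", "e-2", "e3", "e4"]


def surface_loop(surface, edges, points):
    """Return the ordered vertex coordinates visited by a surface's edges."""
    loop = []
    previous_end = None
    for name in surface:
        reverse = name.startswith("e-")
        first, second = edges["e" + name[2:] if reverse else name]
        if reverse:
            first, second = second, first      # negative edge: second -> first
        if previous_end is not None and first != previous_end:
            raise ValueError("edge %s does not continue the loop" % name)
        loop.append(points[first])
        previous_end = second
    return loop
```

Here `surface_loop(square, edges, points)` visits q1, q2, q3, q4 in order, because the negative edge `e-2` is walked from q2 to q3 even though it is stored as (q3, q2).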


;;; This will create an m60a1 as a TWIN object.

(defun make-m60a1 ()
  (setq body (object-to-twin 'm60a1-all))
  (setq gun (Cylinder
             (x p191)                        ; x location of base
             (/ (+ (y p191) (y p186)) 2.0)   ; y location of base
             (/ (+ (z p191) (z p186)) 2.0)   ; z location of base
             (- (x p189) (x p191))           ; x direction of cylinder
             0.0                             ; y direction of cylinder
             0.0                             ; z direction of cylinder
             (/ (- (z p187) (z p191)) 2d0)   ; radius
             8))                             ; number of facets to use
  (Combine body #+ gun))

Figure 4.2.3 LISP program to generate an M60A1 TWIN tank model.
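To show what approximating a cylinder with a fixed number of facets involves, the Python sketch below builds the planar side facets from the same kinds of arguments the Cylinder routine takes: a base location, an axis direction, a radius, and a facet count. It is not the TWIN implementation. In particular, treating the length of the axis vector as the length of the cylinder is an assumption, as are the function names.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def cylinder_facets(base, axis, radius, n_facets):
    """Return n_facets planar side facets, each a tuple of four corner points."""
    length = math.sqrt(sum(c * c for c in axis))
    u = tuple(c / length for c in axis)                  # unit axis direction
    # Pick any vector that is not parallel to the axis to build a basis.
    helper = (0.0, 0.0, 1.0) if abs(u[2]) < 0.9 else (1.0, 0.0, 0.0)
    v1 = cross(u, helper)
    norm = math.sqrt(sum(c * c for c in v1))
    v1 = tuple(c / norm for c in v1)
    v2 = cross(u, v1)                                    # completes the basis

    bottom = []
    for k in range(n_facets):
        t = 2.0 * math.pi * k / n_facets
        bottom.append(tuple(base[i] + radius * (math.cos(t) * v1[i] + math.sin(t) * v2[i])
                            for i in range(3)))
    top = [tuple(p[i] + axis[i] for i in range(3)) for p in bottom]
    return [(bottom[k], bottom[(k + 1) % n_facets], top[(k + 1) % n_facets], top[k])
            for k in range(n_facets)]
```

With eight facets, as in the listing above, the gun barrel becomes an octagonal prism, which is the kind of planar approximation a boundary representation modeler restricted to planar surfaces requires.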


Figure 4.2.4 Rendered M60A1 (200 by 100 pixels).

The TWIN Solid Modeling Package has proven to be much easier to work with and much
faster than the PADL modeler. It is easier to work with because all the routines are written in
C, unlike PADL, which is written in a non-standard dialect of FORTRAN. This makes the TWIN
code easier to read and understand than the PADL code. TWIN is faster for a number of reasons.
First, we are able to control the size of the images being rendered; i.e., if a 511 by 256 image
is needed, that is the size we render. PADL, on the other hand, was designed to render an image
the size of the display device it is using; therefore, if a 511 by 256 image is wanted, a 512 by
480 image must be rendered and then trimmed to the desired size. Second, TWIN approximates
non-planar objects such as spheres and ellipses as being planar. This allows a simple scan line
rendering algorithm to be used. PADL does not approximate non-planar objects and uses a more
time-consuming ray tracing algorithm to render. Although there may be cases where
approximations cannot be used, our application can use them. Finally, TWIN can "compile" CSG
primitives into a complex object once and save that object. Subsequent uses of the object do not
require recompiling. PADL, on the other hand, has no method for saving the resulting compilation
of CSG primitives. Instead, it must rebuild the object each time PADL is restarted. Some of the
BRL objects have as many as 5,000 primitives. Substantial time savings will be realized if such
an object can be compiled only once and the resulting boundary representation used over and over
again.

Section 3.4.2 of this report has shown how geometric features are readily extracted from
the TWIN modeler. The following section describes how a LADAR sensor is simulated.

4.2.2.  The  Electronic  Field  Test 

The  four  basic  elements  needed  to  run  an  electronic  field  test  are  the  terrain,  the  targets, 
clutter,  and  sensor  noise.  This  section  describes  how,  using  our  TWIN  based  modeler,  these 
four  elements  are  combined.  The  actual  steps  are: 

Start  with  a  2D  elevation  array 
Create  a  terrain  patch 
Select  the  targets  for  the  image 
Place  the  targets  on  the  patch 
Render  the  image 
Generate  ground  truth  data 
Convert  to  32  bit  integer 
Create  range  ambiguities 
Blur  the  image 
Add  noise 

The following sections discuss each step in turn.
4.2.2.1.  Create a 2D Elevation Array

The  first  step  in  an  electronic  field  test  is  to  design  the  terrain  for  the  test.  This  is  done  by 
creating  a  two  dimensional  array  of  elevations  over  an  evenly  spaced  grid.  If  "real"  terrain  data 
is  available,  it  can  be  used,  otherwise  fractal  based  methods  can  be  used  to  generate  the  data. 
Here  is  the  LISP  code  to  make  such  an  array: 

(setq elevation (make-array '(17 17)))

This creates the storage needed for the elevation array. The following data was generated using
the fractal techniques presented in Section 4.1.2.2.
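The fractal method of Section 4.1.2.2 is not reproduced here. As a rough illustration only, the sketch below fills a 17 by 17 grid with the common diamond-square midpoint-displacement scheme, which is a stand-in technique and not necessarily the one used in this work. The function name `fractal_elevation` and its `roughness` and `seed` parameters are assumptions introduced for the example.

```python
import random

def fractal_elevation(n=17, roughness=0.5, seed=0):
    """Return an n x n grid (n = 2**k + 1) built by diamond-square displacement.

    Hypothetical helper: a stand-in for the report's own fractal generator.
    """
    if n < 2 or (n - 1) & (n - 2) != 0:
        raise ValueError("n must be of the form 2**k + 1")
    rng = random.Random(seed)
    grid = [[0.0] * n for _ in range(n)]
    for r in (0, n - 1):                      # seed the four corners
        for c in (0, n - 1):
            grid[r][c] = rng.uniform(-1.0, 1.0)
    step, amp = n - 1, 1.0
    while step > 1:
        half = step // 2
        # Diamond step: centre of each square = mean of its corners + noise.
        for r in range(half, n, step):
            for c in range(half, n, step):
                mean = (grid[r - half][c - half] + grid[r - half][c + half] +
                        grid[r + half][c - half] + grid[r + half][c + half]) / 4.0
                grid[r][c] = mean + rng.uniform(-amp, amp)
        # Square step: edge midpoints = mean of in-bounds neighbours + noise.
        for r in range(0, n, half):
            start = half if (r // half) % 2 == 0 else 0
            for c in range(start, n, step):
                nbrs = [grid[rr][cc]
                        for rr, cc in ((r - half, c), (r + half, c),
                                       (r, c - half), (r, c + half))
                        if 0 <= rr < n and 0 <= cc < n]
                grid[r][c] = sum(nbrs) / len(nbrs) + rng.uniform(-amp, amp)
        step, amp = half, amp * roughness     # finer grid, smaller displacement
    return grid

elevation = fractal_elevation(17)
```

A grid size of 17 suits this scheme because 17 = 2^4 + 1, so the grid can be halved four times. Lower `roughness` values give smoother terrain.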
4.2.2.2.  Create Terrain Patch

The  next  step  is  to  create  the  terrain  patch  from  the  elevation  data.  This  is  done  with  the 
following  LISP  command: 

(setq patch (create-patch elevation
                          :extent '((-25 -25 -1)
                                    (25 25 0))))

The :extent keyword is used to give the size of the terrain patch. The above command sets the
minimum x and y values to -25 meters and the maximum x and y values to 25 meters, thus
creating a 50 by 50 meter patch. The minimum z value is -1 meter and the maximum is 0
meters. Changing the relative extent of the z value affects the "ruggedness" of the patch.
Figure 4.2.5 is a rendering of the above patch.
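To make the role of the extent concrete, the following sketch maps grid indices to model-space coordinates. It assumes that the elevation values are rescaled linearly so that their minimum and maximum land on the z limits of the extent; whether create-patch does exactly this is not stated, and `patch_vertices` is a hypothetical name.

```python
def patch_vertices(elevation, extent):
    """Map an elevation grid onto the box ((x0, y0, z0), (x1, y1, z1)).

    Assumption: z is rescaled linearly from the grid's own min/max.
    """
    (x0, y0, z0), (x1, y1, z1) = extent
    rows, cols = len(elevation), len(elevation[0])
    lo = min(min(row) for row in elevation)
    hi = max(max(row) for row in elevation)
    span = (hi - lo) or 1.0          # flat terrain: avoid dividing by zero
    verts = []
    for r in range(rows):
        y = y0 + (y1 - y0) * r / (rows - 1)
        line = []
        for c in range(cols):
            x = x0 + (x1 - x0) * c / (cols - 1)
            z = z0 + (z1 - z0) * (elevation[r][c] - lo) / span
            line.append((x, y, z))
        verts.append(line)
    return verts

demo = patch_vertices([[0, 1, 2], [3, 4, 5], [6, 7, 8]],
                      ((-25, -25, -1), (25, 25, 0)))
```

With a 17 by 17 array and a 50 meter extent, adjacent grid points under this mapping are 50/16 = 3.125 meters apart.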


Figure  4.2.5  Sample  terrain  patch  created  with  fractals. 
4.2.2.3.  Select Targets

The next step is to select the targets to be used in the field test. Figure 4.2.6 shows the four
targets currently available. A target can appear any number of times in a scene and in any
orientation. Section 4.2.3 discusses our efforts to convert the BRL targets to TWIN, which will
increase the selection.

4.2.2.4.  Place  targets  on  patch 

The  next  step  is  to  describe  the  location  and  orientation  of  each  of  the  targets  on  the  patch. 
The  following  code  places  five  targets  on  the  patch: 

(setq tanks (place-objects-on-patch patch
              '((m60a1   0.0 (0 0))
                (m60a1  90.0 (10 10))
                (m35   180   (-6 6))
                (m35   270   (-10 -10))
                (m113   35   (6 -6)))))

The targets are placed in the following locations:


Target    Rotation about Z    Location
          (in degrees)        (in meters)

M60A1       0.0                 0    0
M60A1      90.0                10   10
M35       180.0                -6   -6
M35       270.0               -10  -10
M113       35.0                 6   -6

Knowledge about the elevations of the given locations on the patch is used to automatically
select the correct Z elevation.
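One plausible way to obtain the Z elevation at an arbitrary (x, y) location is bilinear interpolation between the four surrounding grid points. The sketch below is an assumption about how such a lookup could work, not a description of the TWIN routine; `terrain_height` is a hypothetical helper.

```python
def terrain_height(heights, extent_xy, x, y):
    """Bilinearly interpolate a height grid at model-space point (x, y)."""
    (x0, y0), (x1, y1) = extent_xy
    rows, cols = len(heights), len(heights[0])
    if not (x0 <= x <= x1 and y0 <= y <= y1):
        raise ValueError("point lies off the patch")
    # Fractional grid coordinates of the query point.
    gc = (x - x0) / (x1 - x0) * (cols - 1)
    gr = (y - y0) / (y1 - y0) * (rows - 1)
    # Lower-left cell corner, clamped so the far edge stays in range.
    c0 = min(int(gc), cols - 2)
    r0 = min(int(gr), rows - 2)
    fc, fr = gc - c0, gr - r0
    near = heights[r0][c0] * (1 - fc) + heights[r0][c0 + 1] * fc
    far = heights[r0 + 1][c0] * (1 - fc) + heights[r0 + 1][c0 + 1] * fc
    return near * (1 - fr) + far * fr

z_mid = terrain_height([[0.0, 1.0], [2.0, 3.0]], ((0.0, 0.0), (1.0, 1.0)), 0.5, 0.5)
```

A real placement routine would probably also tilt the vehicle to match the local slope; that is not attempted here.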

4.2.2.5.  Render 

Rendering  the  above  scene  with  the  following  command  will  create  four  images:  range, 
shading,  edges,  and  faces. 

(setq images (render tanks
                     :shading "all"
                     :camfrom '(10 4 12)
                     :camto '(-10 -4 8)
                     :range 1.0
                     :size '(512 480)))


Figure  4.2.6  Target  models  available  in  TWIN  modeling  system. 
The rendering routine can render the scene from any viewpoint (:camfrom) to any viewpoint
(:camto) for any given range and sensor resolution (:range, :resolution). Figures 4.2.7
through 4.2.10 show the resulting images.

4.2.2.6.  Display  Ground  Truth 

The next step is to display the ground truth table. The ground truth table lists every
surface in the scene and its location and orientation relative to the viewing plane. The following
command prints the table:

(print-ground-truth  tanks) 

Figure 4.2.11 is a partial listing of the ground truth of the sample scene.

4.2.2.7.  Convert  to  32  bit  integer 

Up to this point, the pixels in the range images are floating point values which are the
distance from the Z axis in the model space. The following routine converts these values to 32-bit
integers using the requested offset and scale. Since the units in the model space are meters and a
scale of 100 is used, the resulting units for the range image are centimeters.

(setq range (range-to-int (first images)
                          :offset 1000
                          :scale 100))
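The exact arithmetic behind range-to-int is not given. The sketch below assumes the offset is added in model units (meters) before the scale is applied, and that results are clamped to the signed 32-bit range; both choices are assumptions, as is the Python function itself.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def range_to_int(image, offset=0.0, scale=1.0):
    """Quantize a floating point range image to 32-bit integer values.

    Assumption: pixel = round((value + offset) * scale), clamped to int32.
    """
    out = []
    for row in image:
        new_row = []
        for value in row:
            q = int(round((value + offset) * scale))
            new_row.append(max(INT32_MIN, min(INT32_MAX, q)))
        out.append(new_row)
    return out

# Two pixels at -14.6 m and 0.0 m, offset 1000 m, scale 100 (centimeters).
quantized = range_to_int([[-14.6, 0.0]], offset=1000, scale=100)
```

If the real routine instead applies the scale first and then adds the offset, only the expression inside `round` changes.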


4.2.2.8.  Create  range  ambiguities 

As presented in previous reports, taking each pixel value modulo 1875 will create range
ambiguities in the image which simulate the ambiguities of the fine range channel of the sensor.

(setq  range  (mod-image  range  1875) ) 

This is shown in Figure 4.2.12.
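The effect of the modulo operation can be seen in a small sketch. `mod_image` here is a plain Python stand-in for the LISP routine of the same name, and the pixel values are made up for illustration. With the image in centimeters, as in the preceding step, an interval of 1875 corresponds to 18.75 meters.

```python
def mod_image(image, interval):
    """Wrap every pixel into [0, interval), as a fine range channel would."""
    return [[value % interval for value in row] for row in image]

# Two illustrative ranges exactly one ambiguity interval (1875 cm) apart.
wrapped = mod_image([[98540, 100415]], 1875)
```

Both pixels wrap to the same value, so surfaces 18.75 meters apart cannot be told apart from the wrapped image alone.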

4.2.2.9.  Blur  the  image 

The following code will blur the image by replacing each pixel with the average of the
pixels around it. Although this does blur the image, it most likely does not simulate the blurring
of a real LADAR sensor. More information is needed about the sensor before its blurring can
be mimicked.

(setq  range  (blur-image  range) ) 

This  is  shown  in  Figure  4.2.13. 
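The neighborhood used by blur-image is not specified. As an illustration, the sketch below assumes a 3 by 3 window that includes the center pixel, is truncated at the image border, and uses integer division; all three choices are assumptions.

```python
def blur_image(image, radius=1):
    """Replace each pixel by the integer mean of its (2*radius+1)^2 window."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = count = 0
            # Window is clipped at the borders, so edge pixels average fewer values.
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    total += image[rr][cc]
                    count += 1
            out[r][c] = total // count
    return out

blurred = blur_image([[0, 0, 0], [0, 9, 0], [0, 0, 0]])
```

Averaging an image that has already been wrapped modulo 1875 smears pixels across the wrap-around boundary, which is one reason a mean filter of this kind may not match a real sensor.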


Figure 4.2.7  Range image of scene created in Section 4.2.2.5.


Figure 4.2.8  Shaded image of scene created in Section 4.2.2.5.


Figure 4.2.9  Edge image of scene created in Section 4.2.2.5.


Figure 4.2.10  Faces image of scene created in Section 4.2.2.5. The grey values in the faces
image index into the ground truth table.


group  Area     min-ext                   max-ext                   centroid                           normal

2      295.70   (198.25,228.54,-14.60)    (313.01,255.14,-7.44)     (256.8253 242.3849 -10.871665)     (-0.00,-0.98, 0.18)
3      46.88    (245.79,254.18,-7.70)     (261.35,260.16,-7.18)     (253.5674 257.1697 -7.43846)       ( 0.20,-0.74, 0.64)
4      46.88    (299.82,250.23,-8.76)     (315.39,256.21,-8.24)     (307.60474 253.22339 -8.501042)    ( 0.20,-0.74, 0.64)
5      24.56                              (261.35,264.69,-7.44)     (257.80124 257.08667 -7.871617)    ( 0.93,-0.07,-0.37)
6      24.56    (291.28,247.30,-9.55)     (302.20,261.71,-8.24)     (298.6542 254.10316 -8.674944)     (-0.93, 0.07, 0.37)
7      350.26   (250.43,247.30,-9.55)     (301.31,264.69,-7.60)     (275.86942 255.99329 -8.57681)     ( 0.13,-0.86, 0.49)
8      169.45   (243.04,259.20,-8.26)     (261.35,278.07,-7.18)     (253.84595 267.84653 -7.69662)     ( 0.26, 0.69, 0.67)
9      23.77    (237.10,277.11,-9.06)     (256.22,282.64,-8.00)     (246.65952 279.87466 -8.527515)
10     154.27                             (315.39,274.13,-8.24)     (305.24637 264.09274 -8.707351)    ( 0.35, 0.68, 0.64)
11     23.77    (291.13,273.17,-10.12)    (310.26,278.69,-9.06)     (300.69684 275.92838 -9.590096)    ( 0.16, 0.96, 0.22)
12     326.18   (256.22,261.71,-9.06)     (301.31,277.11,-7.60)     (278.76627 269.40875 -8.333074)
13     22.62    (196.61,232.48,-13.78)    (211.43,236.63,-13.28)    (204.02386 234.5563 -13.527309)    (-0.25,-0.84,-0.48)
14     22.62    (250.65,228.54,-14.84)    (265.47,232.68,-14.34)    (258.06122 230.60999 -14.589891)   (-0.25,-0.84,-0.48)
15     241.65   (196.61,235.67,-13.78)    (211.43,256.06,-13.49)    (204.02386 245.86302 -13.632289)   (-0.36, 0.06,-0.93)
16     105.13   (198.25,255.09,-13.75)    (216.93,267.42,-12.90)    (207.58922 261.25507 -13.321914)
17     241.65                                                       (258.06122 241.9167 -14.694872)
18     105.13   (252.29,251.15,-14.81)    (270.97,263.47,-13.96)    (261.6266 257.30872 -14.384494)    (-0.19, 0.75,-0.63)
19     1036.93  (211.43,221.63,-14.52)    (252.29,252.15,-13.47)    (231.85962 234.92247 -13.97644)
20     8.94     (209.80,232.48,-13.78)    (211.43,255.09,-13.54)    (211.02458 243.84747 -13.695734)   ( 0.93,-0.07,-0.37)
21     8.94     (250.65,229.50,-14.58)    (252.29,252.11,-14.34)    (251.87758 240.86398 -14.499061)   (-0.93, 0.07, 0.37)
22     49.47                              (224.43,238.41,-11.87)    (217.93243 231.51517 -12.702987)

Entries for the remaining surfaces, listed by column:

Area:      125.77  49.47  701.11  701.11  104.14  104.14  184.44  503.58  184.44
           787.80  243.20  66.82  206.69  433.44  157.01  83.23  456.76  21.24

Extents:   (211.43,221.63,-14.27)  (252.29,221.63,-14.34)  (263.84,228.54,-14.84)
           (196.61,233.44,-13.52)  (203.75,266.45,-13.15)  (257.78,262.51,-14.22)
           (252.29,249.16,-14.55)  (211.43,249.16,-14.52)  (265.28,230.55,-11.87)
           (265.28,235.43,-12.67)  (315.39,277.73,-8.50)   (248.16,282.64,-7.18)
           (250.28,282.64,-8.80)   (297.08,277.11,-8.26)   (264.36,207.97,-12.80)
           (277.63,214.51,-11.04)  (258.41,235.32, 9.62)   (221.65,210.82,-11.89)
           (225.36,207.97,-12.47)  (236.87,214.51,-10.71)  (247.27,207.18,-11.71)
           (259.62,248.52,-9.13)   (236.87,245.64,-9.81)   (268.07,217.84,-9.81)
           (282.55,236.63,-9.13)   (257.49,216.40,-10.17)

Centroid:  (238.35895 226.09067 -13.068134)  (258.78543 228.53171 -13.506313)
           (288.58954 253.78618 -11.896009)  (221.36783 258.69534 -10.574172)
           (227.01295 274.5451 -10.975836)   (281.0503 270.5988 -12.038416)
           (270.11285 263.51236 -12.390967)  (254.25447 263.1367 11.3911915)
           (229.25986 266.4958 -11.587641)   (244.85826 228.07399 -12.260244)
           (273.8639 221.74771 -11.827543)   (281.32422 234.65791 -10.214957)
           (268.6273 238.39363 -9.334141)    (247.017 237.15837 -9.540345)
           (229.25987 228.22997 -10.848946)  (249.74825 212.18248 -11.335811)
           (263.64447 226.11214 -9.776019)   (253.07529 211.34956 -10.820647)

Normal:    ( 0.00,-0.98, 0.18)  ( 0.93,-0.07,-0.37)  ( 0.93,-0.07,-0.37)  (-0.93, 0.07, 0.37)
           (-0.00, 0.98,-0.18)  (-0.00, 0.98,-0.18)  (-0.93, 0.07, 0.37)  (-0.01, 0.98,-0.22)
           ( 0.93,-0.07,-0.37)  (-0.37,-0.17,-0.91)  ( 0.88,-0.22,-0.42)  ( 0.96,-0.06, 0.26)
           ( 0.37, 0.17, 0.91)  (-0.53, 0.05, 0.84)  (-0.95,-0.09, 0.30)  ( 0.00, 0.98, 0.18)

Figure 4.2.11  Partial list of surfaces in Figure 4.2.10.
4.2.2.10.  Add noise

Finally, noise is added to the image which matches the noise measured in real LADAR
images. The noise added here is the same as that described in Section 4.3.2 of [KaYo88]:

(setq range (add-noise range :sky 275))

Since the noise characteristics of a no-return signal differ from the characteristics of background
clutter, the value of the sky is passed to the add-noise routine so it knows when to add the
no-return type noise. This is shown in Figure 4.2.14.
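The noise model itself is defined in [KaYo88] and is not repeated here. Purely to illustrate the branching on the sky value, the sketch below assumes Gaussian noise on pixels with a return and uniformly random values on no-return pixels. Those two distributions, and the `sigma` and `interval` parameters, are placeholders rather than the measured model.

```python
import random

def add_noise(image, sky, sigma=2.0, interval=1875, seed=None):
    """Add placeholder noise; sky (no-return) pixels are treated differently.

    Assumptions: Gaussian noise on returns, uniform noise on no-returns.
    """
    rng = random.Random(seed)
    out = []
    for row in image:
        new_row = []
        for value in row:
            if value == sky:
                # No return: the fine range channel reads an arbitrary value.
                new_row.append(rng.randrange(interval))
            else:
                noisy = int(round(value + rng.gauss(0.0, sigma)))
                new_row.append(noisy % interval)
        out.append(new_row)
    return out

noisy = add_noise([[100, 275], [275, 1040]], sky=275, seed=1)
```

Passing the sky value explicitly mirrors the :sky keyword above: the routine needs it to decide which kind of noise applies to each pixel.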

4.2.2.11.  Future  Work 

Future work includes using a better blurring model based on real data, studying the new
LADAR data to determine what kind of noise to add, and adding clutter.

4.2.3.  Conversion  of  BRL  Objects  to  TWIN  Objects 

In  order  to  convert  solid  objects  from  the  representation  used  by  the  BRL  modeler  to  one 
compatible with the TWIN library, one must first understand how complex objects are
represented in the BRL modeler. The BRL modeler represents solid objects via CSG trees, much as
PADL  does.  Figure  4.2.15  shows  a  simple  CSG  tree  that  could  be  used  to  define  a  solid  object. 
In  this  figure,  the  oval  nodes  in  the  tree  represent  primitive  objects;  these  objects,  such  as 
spheres,  cubes,  etc.,  are  the  elemental  objects  used  by  the  BRL  modeler.  The  rectangular  nodes 
in  the  tree  represent  solids  created  by  the  boolean  combination  of  lower-level  objects.  As  in 
most  CSG  systems,  BRL  objects  can  be  combined  using  three  regularized  boolean  operations: 
union  (u),  intersection  (i),  and  difference  (-).  A  rigid  body  transformation  (rotation,  translation) 
is  applied  to  each  of  the  low-level  objects  before  they  are  combined  to  form  the  parent  object. 
These  transformations  are  represented  as  homogeneous  transformation  matrices  (indicated  in 
the  figure  with  the  symbols  Tj  and  /). 

There are two sub-tasks that must be performed to convert solid objects from the
representation used in the BRL modeler to that used by the TWIN library. First, TWIN objects that
represent BRL primitive objects must be generated. These objects are generated while the BRL
model is being read into the system. When a BRL primitive is encountered in the definition for
a solid, the appropriate library subroutine is called immediately and the TWIN structure
representing the primitive is generated. Second, after TWIN structures for all the BRL primitive
objects have been generated, they are combined into increasingly complex objects. The
following is a list of the primitive solids generated by the BRL modeler.

Figure 4.2.15  BRL CSG tree.

ARB8   An ARB8 is a solid with 8 arbitrarily placed vertices. This primitive solid is
       used to represent such objects as cubes, parallelepipeds, and wedges. The BRL
       modeler also uses the ARB8 structure to represent primitive objects with fewer
       than 8 vertices by setting the coordinates of some of the 8 vertices to the same
       point in 3-space. For example, in a 7-vertex solid, the coordinates of 2 vertices
       would be set to the same point.

TGC    The TGC (truncated generalized cylinder) primitive is used to represent
       cylinders, cones, and elliptical cones.

ELLG   The ELLG (generalized ellipsoid) primitive is used to represent spheres,
       ellipsoids, and ellipsoids of revolution.

TOR    The TOR primitive is used to represent toroidal solids.

Most of the primitive solids listed here have equivalent TWIN counterparts and thus can
be generated directly by TWIN subroutine calls; however, a number of difficulties were
encountered when implementing some of the BRL primitives in TWIN. For example, there was no
subroutine in the TWIN library to generate solids with 8 arbitrary vertices. We have been able
to overcome this difficulty by writing our own routines to create TWIN objects with 8
arbitrarily specified vertices. Also, the definition for a truncated generalized cylinder used in the
BRL package is slightly more general than that used in the TWIN library. We have
implemented the TGC using the slightly less general TWIN definition; if a TGC requiring the more
general BRL definition is encountered, a warning message is printed. So far, we have never
encountered a TGC object that required the more general definition. At this time, we have not
implemented the code to generate TWIN solids from ARB8's with fewer than 8 distinct vertices,
but this should not present any major difficulties.

After the primitive TWIN solids are generated, they must be combined into more complex
solids as defined by the structure of the object's CSG tree. During the conversion process, a
post-order traversal of the CSG tree is performed, and a TWIN solid is created for each node as
it is visited. Because each node is visited after all of its children are visited in a post-order tree
traversal, the TWIN solids corresponding to a node's children are created before they are
needed to be combined into the parent node.
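The post-order evaluation can be sketched as follows. To keep the example self-contained, solids are modeled as sets of integer cells, so the three regularized boolean operations reduce to set operations; the `Node` class and `build` function are illustrative names, not part of TWIN or the BRL modeler, and the per-node rigid body transformations are omitted.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    op: str                               # "prim", "u", "i", or "-"
    solid: Optional[frozenset] = None     # set only for primitives
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def build(node, visited=None):
    """Evaluate a CSG tree in post-order: children first, then the parent."""
    if node.op == "prim":
        result = node.solid
    else:
        a = build(node.left, visited)     # both children are built ...
        b = build(node.right, visited)
        if node.op == "u":                # ... before they are combined
            result = a | b
        elif node.op == "i":
            result = a & b
        elif node.op == "-":
            result = a - b
        else:
            raise ValueError("unknown operation: " + node.op)
    if visited is not None:
        visited.append(node.op)           # record the visiting order
    return result

body = Node("prim", frozenset(range(0, 10)))
cap = Node("prim", frozenset(range(8, 12)))
hole = Node("prim", frozenset(range(3, 5)))
tree = Node("-", left=Node("u", left=body, right=cap), right=hole)

order = []
solid = build(tree, order)
```

The recorded order shows that every combination node appears only after both of its operands, which is the property the conversion relies on.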

We have encountered problems when combining TWIN objects into more complex
entities; specifically, the TWIN combination routines sometimes produce invalid objects when two
valid objects are combined. We have determined that the tolerance problem is the cause of this
behavior. Tolerancing is a fundamental problem of solid modeling that must be overcome, to
some extent, in all solid modeling systems. Two of the most obvious manifestations of the
tolerance problem are the following: When is a point inside the object and when is it outside?
When should two points be considered to be equivalent? The tolerance problem also arises in
many more subtle ways when combining solid models. Obviously, if the modeler cannot solve
these problems, then it cannot produce correct results. At the current time, the TWIN library is
not able to overcome many of these tolerance problems; however, the TWIN library is evolving
rapidly and thus may incorporate more sophisticated tolerancing schemes in the future.
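The two questions above can be illustrated with a fixed absolute tolerance. This is only the simplest possible scheme, shown for a sphere; the tolerance value and the function names are assumptions, and nothing here reflects how TWIN handles tolerances internally.

```python
import math

def same_point(p, q, tol=1e-6):
    """Treat two points as equivalent when they lie within tol of each other."""
    return math.dist(p, q) <= tol

def classify_point(p, center, radius, tol=1e-6):
    """Classify p against a sphere as 'inside', 'on', or 'outside'."""
    d = math.dist(p, center) - radius
    if abs(d) <= tol:
        return "on"                       # within the tolerance band
    return "inside" if d < 0 else "outside"

# Floating point round-off: these differ exactly but agree within tolerance.
a = (0.1 + 0.2, 0.0, 0.0)
b = (0.3, 0.0, 0.0)
exact_equal = (a == b)
tolerant_equal = same_point(a, b)
```

A single global tolerance is easy to apply but is sensitive to the scale of the model, which is part of why the problem is hard in general.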
4.2.4.  Conclusions

The conversion to the TWIN solid modeler has been successful. We are now able to
model more complex scenes than was possible with PADL. Many problems have been overcome in
converting the BRL targets into TWIN targets. Unfortunately, the tolerance problem is one that
has not been solved at this time. Improvements in TWIN could overcome this problem.
APPENDIX A:  UTILITIES


A.1.  A DATABASE FOR MANAGING TARGET IMAGES

After acquiring the Terrain Board, Eglin Turntable, and BRITT target images from NVL,
we found that we had a rich variety of images. These images contained a wide selection of
targets viewed from many ranges and angles. Most images had a header which contained
information on the targets in the image (information such as ranges, orientation, etc.), and the prevailing
environmental conditions. However, selecting a set of targets for a given experiment was
difficult because the header formats were not the same for all the images. In addition, most
headers were hundreds of bytes long and contained extra information which was not needed.
The experiments (classification in particular) required the selection of several targets with
similar characteristics, for example, an M35 truck as viewed from 2.5 km and 100 feet altitude.
Finding such a set of targets is difficult and time consuming since hundreds of headers, each
consisting of hundreds of bytes of information, must be scanned.

To  resolve  this  problem  we  extracted  the  pertinent  information  from  each  of  the  headers 
and  then  used  a  relational  database  manager  to  help  select  the  different  sets  of  targets  needed 
for  the  experiments. 

A generic relational database is described in the following section. Section A.1.2 gives
details on what information was extracted from the headers and stored in the database. Finally,
Section A.1.3 gives examples of how to use the UNIFY Relational Database Management System
[UNIFY] to locate targets of a given description in the database.

A.1.1.  RELATIONAL  DATABASES 

Briefly,  a  relational  system  is  one  in  which: 

1.  the  data  is  perceived  by  the  user  as  tables  (and  nothing  but  tables);  and 

2.  the  operators  at  the  user’s  disposal  (e.g.,  for  data  retrieval)  are  operators  that  generate  new 
tables  from  old.  [Date86] 

For example, there will be one operator to extract a subset of the rows of a given table, and
another  to  extract  a  subset  of  the  columns  -  and  of  course  a  row  subset  and  a  column  subset  of  a 
table  may  both  in  turn  be  regarded  as  tables  themselves.  New  tables  may  be  permanently  saved 
as  part  of  the  database  or  merely  displayed  on  a  terminal. 

Each permanent table (or relation, as it is called) is given a unique name. The BRITT
target data has been stored in a table named britt; Terrain Board and Eglin Turntable data will
similarly be stored in their own tables. Each of the n columns of a table has a unique name
called an attribute. Each row of a table is called a tuple, short for n-tuple, which is made up of
n fields containing the attribute values. A tuple is not allowed to have a null or blank value for
one of its attributes. Each table must have a key attribute or set of attributes whose value or
values uniquely identify a tuple in the relation. For example, if there is a single key attribute,
then no two tuples in the table may have the same value for that attribute.
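These two rules (no null attribute values, and a unique key) can be demonstrated with any relational engine. The sketch below uses Python's built-in sqlite3 module as a stand-in for UNIFY, with a cut-down three-column version of the britt table; the column subset is chosen for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# id is the key attribute; NOT NULL forbids blank attribute values.
conn.execute("create table britt ("
             "id integer primary key, "
             "type text not null, "
             "model text not null)")
conn.execute("insert into britt values (1, 'TANK', 'M551')")

try:
    conn.execute("insert into britt values (1, 'APC', 'M113')")   # same key
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

try:
    conn.execute("insert into britt values (2, NULL, 'M113')")    # null value
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True

row_count = conn.execute("select count(*) from britt").fetchone()[0]
```

Both offending inserts are refused, so the table still holds only the original tuple.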

A.1.2.  DATABASE DESIGN

Table A.1 gives a description of the attributes stored in each tuple of the britt database.

Each tuple in the britt relation represents a target and its associated information. Figure
A.1 is a listing of a portion of the britt database which shows the values of the various
attributes. Similar table definitions will be made for the rest of our target data.

A.1.3.  QUERYING  THE  DATABASE 

After a database is built, almost any database manager can be used. We chose to use the
UNIFY Relational Database Management System [UNIFY] since it is readily available and
supported on the local computer system. UNIFY uses the SQL [SQL] (pronounced SEQUEL)
query language developed by IBM in conjunction with DML, the Data Manipulation Language.

A SQL query consists of clauses, each of which begins with a keyword. The following is a
list of SQL/DML keywords:


and        help       separator
asc        in         set
avg        insert     start
between    into       sum
by         is         unique
count      lines      unlock
delete     max        update
desc       min        where
edit       not        write
end        or
fields     order
from       records
group      restart
having     select

Since  these  commands  are  described  in  the  UNIFY  manuals  we  will  only  briefly  discuss  those 
commands  needed  to  extract  the  targets  for  the  BRITT  classification  experiments. 

There  are  required  and  optional  SQL  clauses.  The  required  clauses  are  as  follows: 


Table A.1  Attributes of britt database.

id:       The key attribute of the relation. Every target is assigned a unique
          three digit i.d. number.

type:     The type of vehicle. In this case, one of APC, JEEP, TANK, or TRUCK.

model:    The specific kind of vehicle. For example, there were two kinds of
          TANKs, M48 and M551.

range:    The distance to the center of field of view of the parent image from
          which the target image was extracted.

angle:    Aspect angle from which the target is seen.

size:     The extracted target image is size by size square.

target:   Name of file containing the target image with the target of interest
          in its center. This image is extracted from the provided image with
          multiple targets in it. The britt target files have been named
          britt###, where ### is the target id.

parent:   Name of the file from which the target was extracted.

x:        x-coordinate in the parent file of the center of the target file.

y:        y-coordinate in the parent file of the center of the target file.

id|type |model|  range|angle|size|target  |parent |  x|  y
 1|TANK |M551 |5.00000|  180| 128|britt001|file001|415|264
 2|TANK |M551 |5.00000|  180| 128|britt002|file001|410|290
 3|APC  |M113 |5.00000|  180| 128|britt003|file001|180|236
 4|APC  |M114 |5.00000|  180| 128|britt004|file001|303|225
 5|TRUCK|M35  |5.00000|  180| 128|britt005|file001|289|308
 6|JEEP |M151 |5.00000|    0| 128|britt006|file001|339|154
 7|TANK |M551 |5.00000|  180| 128|britt007|file002|468|299
 8|TANK |M551 |5.00000|  180| 128|britt008|file002|461|328
 9|APC  |M113 |5.00000|  180| 128|britt009|file002|239|262
10|APC  |M114 |5.00000|  180| 128|britt010|file002|357|251
11|TRUCK|M35  |5.00000|  180| 128|britt011|file002|338|355
12|JEEP |M151 |5.00000|    0| 128|britt012|file002|396|172
13|TANK |M551 |3.50000|  180| 128|britt013|file003|564|261
14|TANK |M551 |3.50000|  180| 128|britt014|file003|555|303
15|APC  |M113 |3.50000|  180| 128|britt015|file003|255|209
16|APC  |M114 |3.50000|  180| 128|britt016|file003|416|193
17|TRUCK|M35  |3.50000|  180| 128|britt017|file003|373|358
18|JEEP |M151 |3.50000|    0| 128|britt018|file003|476| 87
19|TANK |M551 |3.50000|  180| 128|britt019|file004|556|274
20|TANK |M551 |3.50000|  180| 128|britt020|file004|544|316
21|APC  |M113 |3.50000|  180| 128|britt021|file004|249|222
22|APC  |M114 |3.50000|  180| 128|britt022|file004|410|209
23|TRUCK|M35  |3.50000|  180| 128|britt023|file004|355|364
24|JEEP |M151 |3.50000|    0| 128|britt024|file004|476|109
25|TANK |M551 |2.50000|  180| 128|britt025|file005| 56|265
31|     |     |2.50000|  180| 128|britt031|file006|536|257
33|     |     |2.50000|  180| 128|britt033|file006| 97|193
34|APC  |M114 |2.50000|  180| 128|britt034|file006|320|182
35|TRUCK|M35  |2.50000|  180| 128|britt035|file006|262|348
36|JEEP |M151 |2.50000|    0| 128|britt036|file006|387| 86
37|TANK |M551 |5.00000|   45| 128|britt037|file007|469|249
38|TANK |M551 |5.00000|   45| 128|britt038|file007|453|288
39|APC  |M113 |5.00000|   45| 128|britt039|file007|235|222
40|APC  |M114 |5.00000|   45| 128|britt040|file007|339|214
41|TRUCK|M35  |5.00000|   45| 128|britt041|file007|368|343
42|JEEP |M151 |5.00000|    0| 128|britt042|file007|398|115
43|TANK |M551 |5.00000|   45| 128|britt043|file008|470|269
44|TANK |M551 |5.00000|   45| 128|britt044|file008|455|308
45|APC  |M113 |5.00000|   45| 128|britt045|file008|240|240
46|APC  |M114 |5.00000|   45| 128|britt046|file008|342|234
47|TRUCK|M35  |5.00000|   45| 128|britt047|file008|368|353
48|JEEP |M151 |5.00000|    0| 128|britt048|file008|404|146
49|TANK |M551 |3.50000|   45| 128|britt049|file009|504|234
50|TANK |M551 |3.50000|   45| 128|britt050|file009|483|282

Figure A.1  A sample section of the britt database.




select  (attribute  list) 
from  (table  names) 

Some  of  the  optional  clauses  are: 

where  (expression  is  true  or  false) 
into  (an  ASCII  file) 

The  required  select  and  from  clauses  go  hand  in  hand.  The  select  clause  specifies  which 
attributes  (or  columns)  to  print  out  for  the  relations  (tables)  specified  by  the  from  clause.  For 
example: 

sql>  select  id,  type 
sql>  from  britt  / 

causes the id and type columns of the britt table to be displayed. The '/' character tells the
SQL parser that we are done entering a query and that no additional optional clauses follow. One
may print out the entire britt table with the following command:

sql>  select  * 
sql>  from  britt  / 

Here the '*' will match all attribute names, and so all columns of the britt table will be printed.
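The same two queries can be tried without UNIFY. The sketch below uses Python's built-in sqlite3 module as a stand-in, loaded with four rows taken from Figure A.1 and only six of the ten attributes. The column name range is quoted because it is a keyword in recent SQLite versions; that detail belongs to SQLite, not to UNIFY.

```python
import sqlite3

# Rows 1, 3, 6 and 13 of Figure A.1 (id, type, model, range, angle, target).
SAMPLE = [
    (1, "TANK", "M551", 5.0, 180, "britt001"),
    (3, "APC", "M113", 5.0, 180, "britt003"),
    (6, "JEEP", "M151", 5.0, 0, "britt006"),
    (13, "TANK", "M551", 3.5, 180, "britt013"),
]

conn = sqlite3.connect(":memory:")
conn.execute('create table britt (id integer primary key, type text, '
             'model text, "range" real, angle integer, target text)')
conn.executemany("insert into britt values (?, ?, ?, ?, ?, ?)", SAMPLE)

# select id, type from britt /
id_and_type = conn.execute("select id, type from britt order by id").fetchall()

# select * from britt /
whole_table = conn.execute("select * from britt order by id").fetchall()
```

Each result is itself a table (here a list of tuples), in line with the relational model described in Section A.1.1.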

We  have  seen  how  to  select  specific  columns  of  our  tables,  but  not  specific  rows.  For  this 
we  need  to  make  use  of  the  where  clause.  The  where  clause  will  select  a  tuple  only  if  its 
expression  is  true  for  that  tuple.  For  example: 

sql> select *
sql> from britt
sql> where range = 5.0 /

will  select  all  the  tuples  with  values  of  5.0  for  their  range  field.  We  can  now  begin  to  make 
more  complex  queries.  The  query: 

sql> select target
sql> from britt
sql> where angle = 0.0 and type = 'TANK*' /

will print out all of the target filenames containing front views (corresponding to aspect angle=0
degrees) of TANKs. The type attribute is stored as a string, and so must be quoted. The '*'
character matches all trailing blanks in the field. We can also make nested queries:
sql> select target
sql> from britt
sql> where angle = 0 and type = select type
sql>                            from britt
sql>                            where model = 'M48*' or
sql>                                  model = 'M551*' /

Starting  from  the  innermost  query,  the  tuples  with  model  values  of  M48  or  M551  (the  two  kinds 
of  tanks)  are  selected,  then  the  type  column  of  this  temporary  table  is  selected,  then  all  tuples  in 
britt  with  types  in  this  set  and  angle  values  of  0  are  selected,  and  finally  the  target  column  of 
this  temporary  table  is  selected  and  displayed.  The  final  result  is  the  same  as  the  previous 
query,  a  list  of  target  filenames  containing  front  views  of  tanks. 
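The equivalence of the simple and nested forms can be checked with sqlite3 as a stand-in for UNIFY. Two adaptations are assumptions of this sketch: standard SQL writes the nested condition as `type in (select ...)`, and the trailing '*' wildcard is dropped because SQLite strings are not blank-padded. The sample rows from Figure A.1 contain no tanks at angle 0, so angle 180 is used instead.

```python
import sqlite3

# Rows 1, 3, 5, 6 and 13 of Figure A.1 (id, type, model, range, angle, target).
SAMPLE = [
    (1, "TANK", "M551", 5.0, 180, "britt001"),
    (3, "APC", "M113", 5.0, 180, "britt003"),
    (5, "TRUCK", "M35", 5.0, 180, "britt005"),
    (6, "JEEP", "M151", 5.0, 0, "britt006"),
    (13, "TANK", "M551", 3.5, 180, "britt013"),
]

conn = sqlite3.connect(":memory:")
conn.execute('create table britt (id integer primary key, type text, '
             'model text, "range" real, angle integer, target text)')
conn.executemany("insert into britt values (?, ?, ?, ?, ?, ?)", SAMPLE)

simple = conn.execute("""
    select target from britt
    where angle = 180 and type = 'TANK'
    order by target
""").fetchall()

nested = conn.execute("""
    select target from britt
    where angle = 180
      and type in (select type from britt
                   where model = 'M48' or model = 'M551')
    order by target
""").fetchall()
```

The inner query yields the set of types whose model is M48 or M551 (only TANK here), and the outer query then filters on that set.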

The operators >, <, and '= (not equal) may all appear in the where expression. Arithmetic
expressions using the operators +, -, /, and * and attribute names used like variables are allowed
anywhere a simple attribute name is allowed in select and where clauses. The following is a
way to request an attribute within a range of values:

sql>  select  id 
sql>  from  britt 

sql> where type = 'JEEP*' and angle between 0.0 and 180.0 /

which selects the id numbers of all JEEPs whose aspect angle lies between 0 and 180 degrees.

The  order  of  the  query  output  can  be  specified  by  using  the  order  by  clause.  The  data  can 
be  sorted  by  multiple  fields  in  both  ascending  and  descending  order: 

sql>  select  id,  angle,  target 
sql>  from  britt 

sql>  where  range  between  2.5  and  5.0 
sql>  order  by  angle  asc,  id  desc  / 

which  will  get  the  id  number,  angle  of  view,  and  filename  of  all  targets  between  2.5  and  5.0 
kilometers  away  and  output  them  in  ascending  order  by  angle  and  descending  order  by  id 
number  for  targets  with  the  same  angle. 
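The range and ordering query above can be tried out with any SQL engine. The following is a minimal sketch using Python's built-in sqlite3 module rather than the database manager described in this appendix; the table rows, the M151 model name, and the function names are hypothetical and serve only to illustrate the between and order by clauses.

```python
import sqlite3

# Hypothetical rows; the real britt table holds many more targets and attributes.
ROWS = [
    # (id, type, model, angle, range in km, target file name)
    (1, "TANK", "M48", 0.0, 2.5, "t001"),
    (2, "TANK", "M551", 90.0, 5.0, "t002"),
    (3, "JEEP", "M151", 0.0, 3.0, "t003"),
    (4, "TRUCK", "M35", 90.0, 7.5, "t004"),
    (5, "TANK", "M551", 0.0, 5.0, "t005"),
]


def build_db():
    """Create an in-memory stand-in for the britt table."""
    con = sqlite3.connect(":memory:")
    # "range" is quoted because it is a keyword in some SQL dialects.
    con.execute('create table britt (id integer, type text, model text, '
                'angle real, "range" real, target text)')
    con.executemany("insert into britt values (?, ?, ?, ?, ?, ?)", ROWS)
    return con


def targets_in_range(con, lo, hi):
    """Ascending by angle, then descending by id for equal angles."""
    cur = con.execute(
        'select id, angle, target from britt '
        'where "range" between ? and ? '
        'order by angle asc, id desc', (lo, hi))
    return cur.fetchall()


print(targets_in_range(build_db(), 2.5, 5.0))
```

Targets with the same angle come out with the highest id first, which is the behavior described for the order by clause.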

The  last  important  part  of  the  query  language  is  the  ability  to  send  results  of  queries  to 
ASCII  files  for  use  as  input  to  other  programs.  The  "lines  0"  command  for  suppressing  the  table 
header comes in handy here. An example of such a query:

sql>  lines  0
sql>  select  target 
sql>  from  britt 

sql>  where  type  =  'TRUCK*'  and 
sql>  angle  =  180  and 

sql>  range  =  3.0 

sql>  into  savefile  / 

Names  of  the  files  containing  rear  views  of  trucks  at  a  range  of  3.0  kilometers  are  saved  in  file 
savefile.  The  file  savefile  may  then  be  used  as  input  to  a  program  performing  classification 
experiments.  As  another  example,  the  following  query  could  be  used  as  input  to  a  program  to 
extract  targets  from  their  parent  files  for  a  newly  generated  database: 

sql>  select  parent,  x,  y,  size,  target 

sql>  from  britt 

sql>  into  extract.infile  /

A.1.4.  SELECTING  THE  BRITT  CLASSES

With the above tutorial in mind, extracting the targets needed for the three classes used in
the BRITT classification experiments is easy. The tanks class consisted of 50 M551 tanks at a
range of 2.5 km. The following request would extract all M551 tanks at 2.5 km:

sql>  select  target 
sql>  from  britt 

sql>  where  range  =  2.5  and  model  =  'M551*'  / 

The apcs class consisted of 25 M113 apcs and 25 M114 apcs at a range of 2.5 km. They
were selected by using the following query:

sql>  select  target 
sql>  from  britt 

sql>  where  range  =  2.5  and  [model  =  'M113*'  or  model  =  'M114*']  / 

The final class consisted of M35 trucks. There were not enough trucks at the range of
2.5 km, so trucks at all ranges were used:

sql>  select  target 

sql>  from  britt 

sql>  where  model  =  'M35*'  / 

All of the above queries returned more targets than needed, so each of the targets was
viewed and the most suitable ones were used.

A.1.5.  CONCLUSIONS

The use of a database manager has greatly simplified the selection of target images. All
the targets from the Eglin turntable, the NVL terrain board, and the TI data set that we have
received will be placed in the database so that they can be easily located when needed. Additional
attributes such as background clutter and image quality may be added to the database if
they will help in the target selection.

A.2.  RANGE  DATA  FROM  STRUCTURED  LIGHT

This  section  reports  on  the  acquisition  of  structured  light  range  data. 

A.2.1.  LIGHT  STRIPE  IMAGES:  WHAT  ARE  THEY? 

In  our  Robot  Vision  Lab  we  use  a  single-slit  projection  system  for  the  acquisition  of  3-D 
range  data.  Our  sensor  consists  of  a  projector  and  a  camera.  The  projector  illuminates  the 
scene  with  a  single  stripe  of  light,  and  the  camera  records  the  interaction  of  the  stripe  with  scene 
objects.  To  collect  range  data  from  the  scene,  a  robot  arm  moves  the  sensor  in  a  straight  line 
and records the illuminated stripes at equal intervals along the direction of motion (see Figure
A.2). The robot arm moves in the direction of the arrow in the figure, stopping at equal intervals
along the way to collect the image of a single stripe.

The stripe images as recorded by the camera can be translated easily into what is called the
pixel offset data, which is illustrated in Figure A.3. In that
figure, with pixel P in the source-viewpoint frame, we associate the offset d(i,j) as obtained
from the location of the corresponding illuminated pixel in the camera image. This pixel offset
data can be translated into a range map of the scene. The offset values are multiplied by a calibration
matrix to obtain the (x,y,z) coordinates of points on the detected stripe.

A  program  has  been  written  to  convert  the  raw  (x,y,z)  data  of  the  range  map  into  an  image 
in  which  pixel  brightness  corresponds  to  distance  from  the  sensor  path.  The  resulting  range 
image  is  basically  an  orthogonal  projection  of  the  scene,  and  so  is  very  much  like  the  laser 
range  images  we  receive  from  NVL.  There  are  slight  differences,  however.  Due  to  occlusion 
problems,  the  range  images  we  generate  have  some  areas  with  no  valid  range  information.  This 
is due to the geometry of the sensor. Because the camera is next to the projector, there will be
cases in which the projected stripe will lie on some part of the object that is hidden from the
camera by some other part of the object. The result is a shadow-like void of missing information
in the range image. Also, points more than a few feet from the scanner are not detected
well  due  to  the  spreading  of  the  stripe.  We  therefore  have  an  abrupt  falloff  in  valid  data  as  we 
move  far  enough  from  the  scanner.  The  distance  at  which  this  phenomenon  occurs  is  obvious 
when  one  observes  the  composite  stripe  or  range  images. 

A.2.2.  OUR  SIMULATED  TARGET  RANGE  IMAGES

We have constructed scale models of several tanks and generated range maps from light
stripe images taken of them. Figure A.4 shows composite light stripe images of an M48 tank
model taken from eight different aspect angles. Figure A.5 shows the corresponding range
images, where lighter pixels correspond to nearby points, darker pixels are farther away, and
white pixels are areas of no information due to occlusion and spreading of the stripe, as discussed
above. Note the artificial contours in the range images, which are especially noticeable in
the ground plane on which the tank sits. The false edges between bands are due to the gray
value quantization of the image hardcopy device, and are not really present in the image itself.

Figure A.2  Light stripe image collection using a linear scan with the sensor by a robot.

Figure A.3  Shown are the source viewpoint frame (light source plane) and the camera
viewpoint frame (camera image plane). Pixel offset d(i,j) is the horizontal distance
in the camera image corresponding to point P on the j-th stripe projected
by the source. The quantity d(i,j) is measured from the left hand edge of the i-th
scan line of the camera image.

Figure A.4  Composite light stripe images of a model of an M48 tank.
A.2.3.  CONCLUSIONS

The figures show that we are able to generate range images of targets in our lab. Future
work in this area will include down-sampling the images so that the number of pixels on target
will be approximately the same as in real LADAR images.

APPENDIX  B:  DETAILED  DESCRIPTION  OF  LADAR  DETECTION  PROGRAMS


The purpose of this appendix is to describe the LADAR target detector developed by the
Robot Vision Lab at Purdue, and how to use the programs of which it is composed. The detector
that this document describes is the first attempt at detecting tactical targets using a single line of
LADAR data. The theory of our detector was described in Section 3.2.4. In that section, we proposed
a pixel based detector that uses the range value of a pixel and the range values of the
neighboring pixels to classify the pixel as belonging to the background or the object. We also
discussed some aspects of detection theory and explained the need for estimating the density
functions of the target and the background for robust detection.

The detection process is non-trivial in that it requires the use of a number of programs, all
of which interact with each other. Because of this, there are a number of sections to this
appendix. These sections include a description of what the detection programs do and how they
are used. A description of the data files that the programs expect to see and examples demonstrating
program usage are also included. It should also be noted that the programs as supplied
can work with data vectors of up to 50 dimensions, and this limit can be increased by merely changing a
single parameter and recompiling the programs.

B.1.  PROGRAM  DESCRIPTIONS

The target detector consists of four programs. I2V and V2I convert
data from an image to data vectors and back again, MAKECLASS estimates the density function
of the target and background and stores relevant parameters for the detector, and QCLASS
is the actual detector. All these programs will be described in greater detail in the following
sections.

B.1.1.  Operation  of  I2V 

The purpose of I2V (image to vector) is to extract data vectors from an image, so that the
detector need not know that the data vectors come from an image or how they are composed.
I2V works in two modes, depending on whether it is being used for training the
detector or for generating vectors for detection. These two modes of operation will be called
training and detection modes respectively. Since I2V is easiest to understand in detection
mode, we will discuss that first.

In detection mode, I2V was designed to generate a data vector for every pixel in the input
image and pass these vectors to the detector. To be precise, the data vector is built as follows:
If the target is at the farthest range, choose N range values from the scan line centered
around the pixel being considered; N is the minimum number of pixels needed to guarantee that
an entire target (plus some background) is covered. These range values are then used to build
an N dimensional vector. If the target is closer than the maximum range, down-sample the scan
line around the pixel so that N range values still cover the entire target (plus some background).
In this mode, I2V reads the image from standard input and writes a vector file to standard output.
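The construction of a detection-mode data vector can be sketched as follows. This is a minimal illustration and not the I2V source: the function names, the nearest-neighbour down-sampling, and the repetition of edge pixels at the image border are all assumptions, since the report does not specify these details. The parameters x_in and x_out play the roles of the target size in pixels and the vector dimension after down-sampling.

```python
def scan_line_vector(row, center, x_in, x_out):
    """Build an x_out-dimensional vector from x_in pixels centered on `center`.

    Pixels outside the scan line are filled by repeating the edge value
    (an assumption; border handling is not described in the report).
    """
    half = x_in // 2
    last = len(row) - 1
    window = [row[min(max(center - half + k, 0), last)] for k in range(x_in)]
    if x_out == x_in:
        return [float(v) for v in window]  # no resampling needed
    # Pick x_out evenly spaced samples across the window (nearest neighbour).
    step = (x_in - 1) / (x_out - 1) if x_out > 1 else 0.0
    return [float(window[int(round(k * step))]) for k in range(x_out)]


def image_to_vectors(image, x_in, x_out):
    """One vector per pixel, in raster-scan order."""
    return [scan_line_vector(row, c, x_in, x_out)
            for row in image for c in range(len(row))]


row = list(range(10))
print(scan_line_vector(row, 5, 5, 5))  # full-resolution window
print(scan_line_vector(row, 5, 5, 3))  # same window, down-sampled
```

With x_in=49 and x_out=25, the step works out to exactly 2, so every other pixel of the 49-pixel window is kept.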

In training mode, operation of I2V is similar to operation in detection mode. The data
vector is computed exactly as in detection mode; the only difference between the two
modes is how the program reads the image files and where it writes the data vector files. In
training mode, I2V reads two image files, one containing the actual image and a second one
containing the segmented version of the image (a pixel in the segmented image is defined to be
zero if the pixel is in the background and nonzero otherwise). It also writes to two data vector files,
one for target pixels and another for background pixels. After the data vector has been computed,
the pixel in the segmented image is checked; if the pixel is zero, the data vector is written
to the background vector file, otherwise it is written to the object vector file. (The formats of the
data files will be described later.)
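The routing of training vectors by the segmentation image can be sketched as below. This is only an illustration under assumed names (split_training_vectors is hypothetical); it takes vectors that have already been computed, one per pixel in raster-scan order, and does not model the skip parameter or file appending.

```python
def split_training_vectors(vectors, segmentation):
    """Route each per-pixel vector by its segmentation label.

    `segmentation` is a list of rows; zero marks background and any
    nonzero value marks an object pixel.
    """
    labels = [label for row in segmentation for label in row]
    if len(labels) != len(vectors):
        raise ValueError("expected one vector per pixel")
    background = [v for v, label in zip(vectors, labels) if label == 0]
    objects = [v for v, label in zip(vectors, labels) if label != 0]
    return background, objects


vecs = [[1.0], [2.0], [3.0], [4.0]]
seg = [[0, 7],
       [0, 1]]
print(split_training_vectors(vecs, seg))
```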

B.1.2.  Operation  of  MAKECLASS

MAKECLASS is the program used to train the detector. To do this, it creates an estimate
of the density function and writes the density function's statistics (mean and covariance matrix)
to a file for use by the detector. Because the density function was assumed to be Gaussian (otherwise
the process would be intractable) it is completely described by these statistics. The input
to MAKECLASS is a data vector file created by I2V in training mode and the output is a parameter
file containing the necessary statistics.
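The statistics that MAKECLASS stores can be estimated as in the sketch below. It is not the MAKECLASS source; the function names are hypothetical, and the division by m - 1 (the unbiased estimate) is an assumption, since the report does not say which normalization is used. The output layout follows the parameter file description in this appendix: covariance rows first, then the mean vector.

```python
def estimate_gaussian(vectors):
    """Return (mean, covariance) of a list of equal-length vectors."""
    m = len(vectors)
    n = len(vectors[0])
    mean = [sum(v[i] for v in vectors) / m for i in range(n)]
    # Unbiased estimate (divide by m - 1); the normalization used by
    # MAKECLASS itself is not stated in the report.
    cov = [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in vectors) / (m - 1)
            for j in range(n)] for i in range(n)]
    return mean, cov


def format_parameter_file(mean, cov):
    """Covariance rows first, then the mean vector, one row per line."""
    lines = [" ".join(repr(x) for x in row) for row in cov]
    lines.append(" ".join(repr(x) for x in mean))
    return "\n".join(lines) + "\n"


mean, cov = estimate_gaussian([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])
print(format_parameter_file(mean, cov))
```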

B.1.3.  Operation  of  QCLASS 

QCLASS is the actual detector. In fact, it is a very general purpose detector and can work
with any type of data. It reads the data vectors generated by I2V in detection mode, does the
classification as object/background and outputs its result to the standard output. To accomplish
the classification, it reads in a data vector for a pixel and then computes the value of the target
and background density estimates at that point. If these values are called f(target) and f(noise)
respectively, then a decision that the pixel is part of a target is made if f(target) > C * f(noise);
otherwise the pixel is said to be noise. QCLASS then outputs the decision to the standard output
(it outputs a “1” if it thinks the pixel is from an object, “0” otherwise). Note: C
corresponds to a threshold and depends on the a-priori class probabilities and the costs of false
detections and detection misses. At the current time, C is set equal to 1.
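The decision rule f(target) > C * f(noise) for Gaussian densities can be sketched as follows. This is an illustrative implementation, not the QCLASS source: the function names are hypothetical, and the comparison is carried out on log densities (an implementation choice made here to avoid underflow in high dimensions), which gives the same decision for C > 0.

```python
import math


def cholesky(a):
    """Lower-triangular L with L * L^T = a (a must be positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def log_gaussian(x, mean, cov):
    """Log of the multivariate Gaussian density at x."""
    n = len(x)
    L = cholesky(cov)
    # Solve L z = (x - mean) by forward substitution.
    z = []
    for i in range(n):
        s = x[i] - mean[i] - sum(L[i][k] * z[k] for k in range(i))
        z.append(s / L[i][i])
    log_det = 2.0 * sum(math.log(L[i][i]) for i in range(n))
    return -0.5 * (sum(v * v for v in z) + log_det + n * math.log(2.0 * math.pi))


def classify(x, target, noise, c=1.0):
    """Return 1 if f(target) > C * f(noise), else 0 (log-domain comparison)."""
    lt = log_gaussian(x, *target)
    ln = log_gaussian(x, *noise)
    return 1 if lt > math.log(c) + ln else 0


identity = [[1.0, 0.0], [0.0, 1.0]]
target = ([5.0, 5.0], identity)   # hypothetical (mean, covariance) pairs
noise = ([0.0, 0.0], identity)
print(classify([4.5, 5.2], target, noise), classify([0.1, -0.3], target, noise))
```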

B.1.4.  Operation  of  V2I 

V2I (vector to image) is the program that reassembles the classified pixels back into an
image. It merely reads in the “0’s” and “1’s” produced by QCLASS and outputs unsigned chars
(bytes) in the form of an image.

B.2.  PROGRAM  OPERATION

B.2.1.  I2V  Operation 

The following are the command line parameters that I2V uses.

size=<image size>         -- size of image (for square images)
                          -- or
rows=<# rows>             -- # of rows in image
cols=<# cols>             -- # of cols in image
x_in=<x target size>      -- x size of target (in pixels)
x_out=<x dim>             -- x size of target (after down-sampling)
y_in=<y target size>      -- should be 1 for ladar detection
y_out=<y target size>     -- should be 1 for ladar detection
-mean                     -- include if want to normalize by mean
                          -- of row being scanned
skip=<sampling density>   -- used when training detector
                          -- skip to every nth pixel
                          -- used to reduce amount of data processed

-- Command line parameters to use only when training classifier

grey=<file name>          -- file name of grey scale file (input)
seg=<file name>           -- file name of segmentation file (input)
back=<file name>          -- file name of background vector file (output)
obj=<file name>           -- file name of object vector file (output)
                          -- Note: the output files append
                          -- the new data vectors to the
                          -- end of the corresponding
                          -- files so samples can be
                          -- gathered from multiple files.

-- Command line parameters to use only when detecting targets

STDIN                     -- input image file
STDOUT                    -- output data vector file

-- Note that if x_in = x_out and y_in = y_out then no resampling is done.

B.2.2.  MAKECLASS  Operation

The following are the command line parameters that MAKECLASS uses.

dim=<dimension>           -- data vector dimension
                          -- should be equal to x_out * y_out
                          -- as sent to I2V
STDIN                     -- input data vector file
STDOUT                    -- output parameter file
                          -- Note: the name of the output file should have the form:
                          --   <core name>.0  -- for background data vector file
                          --   <core name>.1  -- for object data vector file


B.2.3.  QCLASS  Operation 

The following are the command line parameters that QCLASS uses.

dim=<dimension>           -- data vector dimension
                          -- should be equal to x_out * y_out
                          -- as sent to I2V
file=<file name core>     -- parameter file name core. Parameter
                          -- file names should be <file name core>.#
                          -- see the example commands for further
                          -- clarification.
<file name core>.0        -- background parameter file name
<file name core>.1        -- object parameter file name

STDIN                     -- input data vector file
                          -- should have one vector per pixel
STDOUT                    -- output data vector file
                          -- 1 => object
                          -- 0 => background
                          -- should have one vector per pixel

B.2.4.  V2I  Operation

The following are the command line parameters that V2I uses.

size=<image size>         -- size of image (for square images)
                          -- or
rows=<# rows>             -- # of rows in image
cols=<# cols>             -- # of cols in image

STDIN                     -- input data vector file
                          -- 1 => object
                          -- 0 => background
                          -- should have one vector per pixel
STDOUT                    -- output image

B.3.  SAMPLE  COMMANDS

The following are samples provided to make the explanations a little more concrete.

-- This example shows how to train the detector
-- The first two (i2v) commands pull training samples from two images

i2v x_in=49 x_out=25 y_in=1 y_out=1 rows=96 cols=160 -mean skip=11 #
    grey=images.324.im03 #
    seg=segment.324.im03 #
    back=background obj=object

i2v x_in=49 x_out=25 y_in=1 y_out=1 rows=96 cols=160 -mean skip=11 #
    grey=images.324.im03 #
    seg=segment.324.im03 #
    back=background obj=object

-- Note that the character "#" means continue the command on the next line.

-- The next two commands compute the density function statistics

makeclass dim=25 file=background > class.0
makeclass dim=25 file=background > class.1

-- This example shows how to run an actual detection experiment

i2v x_in=49 x_out=25 y_in=1 y_out=1 rows=96 cols=160 -mean #
    < image.324.im03 #
    | qclass dim=25 -v file=class class.1 class.0 #
    | v2i rows=96 cols=160 > detect.32403

B.4.  DATA  FILE  TYPES

There are three types of files used in the detection process: image files, data
vector files and parameter files.

B.4.1.  Image  Files 

The  image  files  consist  of  unsigned  character  (byte)  data  stored  in  the  usual  raster  scan 
fashion.  Therefore,  a  160  x  96  range  image  would  consist  of  15360  bytes  of  range  information. 

B.4.2.  Data  Vector  files 

The data vector files are created to interface with the detection programs QCLASS and
MAKECLASS. These are ASCII files with one data vector per line. The lines consist of N floating
point numbers separated by spaces or tabs. So the files end up looking like:

sample1[0]  sample1[1]  ...  sample1[N]
sample2[0]  sample2[1]  ...  sample2[N]
...
sampleM[0]  sampleM[1]  ...  sampleM[N]

Where N is the dimensionality of the data. The following is an example of a (very short) 5
dimensional data vector file:

2.124582  0.466372  1.442297  0.129548  0.475008  ; 1st pixel
0.449099  1.355932  1.053654  1.675483  2.020944  ; 2nd pixel
0.120965  2.307648  1.265484  0.027915  1.823786
1.262410  0.635385  1.329293  1.103564  1.981398
0.344403  0.041000  1.279213  1.254613  1.303813
1.256044  0.249545  1.380816  1.106317  0.133091
2.087861  1.588770  1.264362  1.006499  0.673772
0.507408  1.330907  1.222771  1.505589  0.091500
1.247726  1.214453  1.430725  1.347544  1.763452

Notice that anything after a semicolon in a line is defined as a comment and is ignored.
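A reader for this file format might look like the sketch below. The function name is hypothetical, and the sketch assumes, as in the sample listing, that a comment starts at the ';' character and that blank lines carry no data.

```python
def parse_vector_lines(lines, dim):
    """Parse ASCII data vectors, ignoring ';' comments and blank lines."""
    vectors = []
    for line in lines:
        body = line.split(";", 1)[0].strip()  # drop trailing comment
        if not body:
            continue
        values = [float(tok) for tok in body.split()]  # spaces or tabs
        if len(values) != dim:
            raise ValueError("expected %d values, got %d" % (dim, len(values)))
        vectors.append(values)
    return vectors


sample = [
    "2.124582  0.466372  1.442297  0.129548  0.475008  ; 1st pixel",
    "",
    "0.449099\t1.355932  1.053654  1.675483  2.020944  ; 2nd pixel",
]
print(parse_vector_lines(sample, 5))
```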
B.4.3.  Parameter  Files

The parameter files contain the covariance matrix and mean vector of the object and background
density functions. Because the densities are assumed to be Gaussian, these are the only
values needed to determine the density function. Thus, the detector trainer needs only to store these
two parameters for the detector to use. The format of these files is similar to the format of the
data vector files; they too are ASCII files. The first N valid vector lines are assumed to be the
covariance matrix of the class, and the next valid vector line is assumed to be the density
function's mean vector. The format of the files looks like this:

covar[0,0]  covar[0,1]  ...  covar[0,N]
covar[1,0]  covar[1,1]  ...  covar[1,N]
...
covar[N,0]  covar[N,1]  ...  covar[N,N]
mean[0]     mean[1]     ...  mean[N]

Where once again N is the dimensionality of the data. A sample 5 dimensional parameter file is
shown next:

 0.373155    0.0659588  -0.0151386  -0.0715956  -0.0860195  ; covariance
 0.0659588   0.300289    0.0780137  -0.0323843  -0.0698119
-0.0151386   0.0780137   0.248101    0.0794427  -0.0335361
-0.0715956  -0.0323843   0.0794427   0.268688    0.0508662
-0.0860195  -0.0698119  -0.0335361   0.0508662   0.304258
 0.989099    1.01034     1.03651     0.957537    0.922343   ; mean vector

Once  again  note  the  ability  to  use  comments  in  the  data  lines. 
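The parameter file layout (N covariance rows followed by one mean row) can be read back as in the following sketch. The function names are hypothetical, the sample text repeats the 5 dimensional values of this section laid out five per line, and the symmetry check is an added sanity test rather than something the detector is stated to perform.

```python
SAMPLE = """\
 0.373155   0.0659588 -0.0151386 -0.0715956 -0.0860195 ; covariance
 0.0659588  0.300289   0.0780137 -0.0323843 -0.0698119
-0.0151386  0.0780137  0.248101   0.0794427 -0.0335361
-0.0715956 -0.0323843  0.0794427  0.268688   0.0508662
-0.0860195 -0.0698119 -0.0335361  0.0508662  0.304258
 0.989099   1.01034    1.03651    0.957537   0.922343  ; mean vector
"""


def parse_parameter_text(text, dim):
    """Return (covariance rows, mean vector) from parameter file text."""
    rows = []
    for line in text.splitlines():
        tokens = line.split(";", 1)[0].split()  # strip comment, then split
        if tokens:
            rows.append([float(tok) for tok in tokens])
    if len(rows) < dim + 1 or any(len(r) != dim for r in rows[:dim + 1]):
        raise ValueError("need dim covariance rows and one mean row")
    return rows[:dim], rows[dim]


def is_symmetric(cov, tol=1e-9):
    """A covariance matrix should equal its transpose."""
    n = len(cov)
    return all(abs(cov[i][j] - cov[j][i]) <= tol
               for i in range(n) for j in range(n))


cov, mean = parse_parameter_text(SAMPLE, 5)
print(is_symmetric(cov), mean)
```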


APPENDIX  C:  DERIVATION  OF  MAITRA’S  INVARIANT  MOMENTS 


This  appendix  is  not  intended  for  a  reader  well  conversant  with  the  theory  of  image 
moments.  In  fact,  even  a  reader  who  is  not  familiar  with  this  theory  might  wonder  about  why 
we  have  taken  the  trouble  of  rederiving  the  results  that  are  amply  documented  in  the  archival 
literature. 

Our motivation for rederiving the expressions shown here was the discovery of an error in
the  moment-invariants  used  in  the  Martin  Marietta  report.  This  apparently  was  caused  by  a 
typographical  error  in  the  original  1962  paper  by  Hu  [Hu62];  this  error  being  subsequently 
reported  upon  by  Maitra  in  1979  [Ma79].  This  and  the  other  errors  that  Maitra  found  in  the 
literature  that  preceded  him  prompted  us  to  rederive  for  ourselves  all  the  major  results. 

In what follows, we will rederive expressions for the region level features used in [MM84].
The main features are Moment Invariants based on Hu's paper [Hu62]. Moment invariants,
while invariant under rotation, translation and scale change, are not invariant under illumination
change. Maitra's invariants [Ma79] are invariant under illumination change, rotation, translation
and scale change, and thus are sufficient to characterize an image under these transformations.
However, note that not all the features listed in Table 9 of [MM84] are necessary for target
characterization. Specifically, the moment invariants and Maitra's invariants are not
independent.


C.1.  Moments

Since  Hu’s  paper  [Hu62],  moments  have  been  applied  to  pattern  recognition  problems. 
Features  that  are  invariant  under  translation,  rotation,  and  scaling  can  be  derived  from  the  two 
dimensional  moments.  The  two  dimensional  moments  represent  a  countable  collection  of 
weighted  averages  of  an  image  as  can  be  seen  from  the  following  definition: 

Definition 1: The two dimensional moment of (p+q)th order is defined by the following Riemann
integral,

$$m_{pq} = \iint x^p y^q \rho(x,y)\,dx\,dy$$

for p,q = 0,1,2,..., where $\rho$ is a piecewise continuous function defined over a finite region in the
plane.
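For a digital image the integral in Definition 1 is usually approximated by a sum over pixels. The sketch below assumes that x is the column index, y is the row index, and each pixel has unit area; these conventions and the function name are assumptions made for illustration.

```python
def raw_moment(image, p, q):
    """Discrete approximation of m_pq: sum of x^p * y^q * rho(x, y)."""
    return sum((x ** p) * (y ** q) * value
               for y, row in enumerate(image)
               for x, value in enumerate(row))


image = [[0, 1],
         [2, 3]]
m00 = raw_moment(image, 0, 0)          # total intensity
x_bar = raw_moment(image, 1, 0) / m00  # centroid, column direction
y_bar = raw_moment(image, 0, 1) / m00  # centroid, row direction
print(m00, x_bar, y_bar)
```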

Moments are a faithful representation of an image. If we consider the image in the xy
plane to be a piecewise continuous function whose values are the intensity over the pixels of an
image, then the following theorem [Hu62] asserts that the moment sequence can reconstruct the
image, and furthermore the moments are unique.

Theorem 2: The double moment sequence $\{m_{pq}\}$ as defined above is uniquely determined by
$\rho(x,y)$ provided that the integration is carried over a finite region. Conversely, $\rho(x,y)$ is uniquely
determined by the set $\{m_{pq}\}$. See [Hu62] for the proof.

C.1.1.  Central  Moments 

It is more convenient to work with central moments, which correspond to moments computed
about the centroid. In this section we define the central moments and relate them to the
moments.

Definition 3: The central moment of (p+q)th order is defined by

$$\mu_{pq} = \iint (x - \bar{x})^p (y - \bar{y})^q \rho(x,y)\,dx\,dy, \qquad p,q = 0,1,2,\ldots \tag{C.1}$$

where $\bar{x} = m_{10}/m_{00}$ and $\bar{y} = m_{01}/m_{00}$.

It is clear from (C.1) that the central moments $\mu_{pq}$ do not change under the following
transformation,

$$x' = x + \alpha, \qquad y' = y + \beta,$$

that is, $\mu'_{pq} = \mu_{pq}$ under translation of coordinates by $\alpha$, $\beta$, where $\mu'_{pq}$ are the moments in the
translated coordinate system.


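Central moments can be obtained from the moments $\{m_{pq}\}$. To do so, consider (C.1) and
use the binomial expansion, i.e.,

$$(x - \bar{x})^p = \sum_{k=0}^{p} \binom{p}{k} (-1)^k\, \bar{x}^k x^{p-k}, \qquad \text{where } \binom{p}{k} = \frac{p!}{(p-k)!\,k!}.$$

One can obtain a similar expression for $(y - \bar{y})^q$. Substituting the binomial expansions in (C.1) we
obtain

$$\mu_{pq} = \iint \sum_{k=0}^{p} \sum_{l=0}^{q} \binom{p}{k} \binom{q}{l} (-1)^{k+l}\, \bar{x}^k \bar{y}^l\, x^{p-k} y^{q-l} \rho(x,y)\,dx\,dy$$

and by arranging the terms and interchanging the order of summation and integration we get

$$\mu_{pq} = \sum_{k=0}^{p} \sum_{l=0}^{q} \binom{p}{k} \binom{q}{l} (-1)^{k+l}\, \bar{x}^k \bar{y}^l\, m_{p-k,\,q-l} \tag{C.2}$$

Equation (C.2) yields the following relationships between central moments and moments
of order less than or equal to three:

$$\begin{aligned}
\mu_{00} &= m_{00} \\
\mu_{01} &= \mu_{10} = 0 \\
\mu_{20} &= m_{20} - \bar{x}\, m_{10} \\
\mu_{11} &= m_{11} - \bar{x}\bar{y}\, m_{00} \\
\mu_{02} &= m_{02} - \bar{y}\, m_{01} \\
\mu_{30} &= m_{30} - 3\bar{x}\, m_{20} + 2\bar{x}^2 m_{10} \\
\mu_{21} &= m_{21} - 2\bar{x}\, m_{11} - \bar{y}\, m_{20} + 2\bar{x}^2 m_{01} \\
\mu_{12} &= m_{12} - 2\bar{y}\, m_{11} - \bar{x}\, m_{02} + 2\bar{y}^2 m_{10} \\
\mu_{03} &= m_{03} - 3\bar{y}\, m_{02} + 2\bar{y}^2 m_{01}
\end{aligned}$$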

Since  central  moments  are  translation  invariant,  and  are  uniquely  related  to  moments  by 
(C.2),  they  can  also  represent  images.  Thus,  from  now  on,  we  consider  only  central  moments. 
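The relation (C.2) can be checked numerically by comparing it with a direct evaluation of the central moment definition. The sketch below uses a discrete sum over pixels in place of the integrals, with x taken as the column index and y as the row index; the function names and the test image are hypothetical.

```python
from math import comb


def raw_moment(image, p, q):
    """Discrete m_pq with x = column index and y = row index."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(image) for x, v in enumerate(row))


def centroid(image):
    m00 = raw_moment(image, 0, 0)
    return raw_moment(image, 1, 0) / m00, raw_moment(image, 0, 1) / m00


def central_moment_direct(image, p, q):
    """Central moment straight from its definition (C.1)."""
    xb, yb = centroid(image)
    return sum(((x - xb) ** p) * ((y - yb) ** q) * v
               for y, row in enumerate(image) for x, v in enumerate(row))


def central_moment_from_raw(image, p, q):
    """Central moment from raw moments using the double sum (C.2)."""
    xb, yb = centroid(image)
    return sum(comb(p, k) * comb(q, l) * (-1) ** (k + l)
               * xb ** k * yb ** l * raw_moment(image, p - k, q - l)
               for k in range(p + 1) for l in range(q + 1))


IMAGE = [[0, 1, 0],
         [2, 3, 1],
         [0, 4, 2]]
for p, q in [(2, 0), (1, 1), (0, 2), (3, 0), (2, 1)]:
    print(p, q, central_moment_direct(IMAGE, p, q),
          central_moment_from_raw(IMAGE, p, q))
```

The two columns printed for each order should agree up to floating point rounding.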

C.2.  Moment  Invariants 

We now define the notion of invariance. The main idea is to obtain “invariants” under
various transformations. Specifically, if an image undergoes a change in size, rotation or translation,
the “invariants” do not change. Thus the image may be characterized independently of
these changes.

Given an image over a region in the xy plane, moment invariants characterize the image
independent of linear transformations. If we let $\mu$ be a particular set of moments and $f$ be an
invariance function over the set $\mu$, that is, $f : \gamma \subset \mu \to R$, then the necessary condition for invariance
can be stated as

$$f(\mu') = f(\mu)$$

where $\mu'$ is the resulting moment set under transformations.

In order to maintain theoretical consistency with [MM84], the approach followed in this
report is that of Hu [Hu62], rather than Teague's [Te80], although the latter possesses the advantage
of simplicity.

Before we proceed with the moment invariant derivation, the following definitions are necessary.


Definition 4: A homogeneous polynomial of the form

$$f = a_{p,0}\,u^p + \binom{p}{1} a_{p-1,1}\,u^{p-1}v + \cdots + \binom{p}{p-1} a_{1,p-1}\,uv^{p-1} + a_{0,p}\,v^p$$

is called a binary algebraic form and is denoted by

$$f = (a_{p,0};\ a_{p-1,1};\ \ldots;\ a_{0,p})\,(u,v)$$


Definition 5: Let $(a_{p,0};\ a_{p-1,1};\ \ldots;\ a_{0,p})\,(u,v)$ be a binary algebraic form. A homogeneous
polynomial $I(a_{p,0}, a_{p-1,1}, \ldots, a_{0,p})$ is called an algebraic invariant of weight w if

$$I(a'_{p,0}, a'_{p-1,1}, \ldots, a'_{0,p}) = \Delta^w\, I(a_{p,0}, a_{p-1,1}, \ldots, a_{0,p})$$

where $a'_{p,0}, a'_{p-1,1}, \ldots, a'_{0,p}$ are the resulting coefficients of the binary algebraic form when
(u,v) is transformed into (u',v') by

$$\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} \alpha & \gamma \\ \beta & \delta \end{bmatrix} \begin{bmatrix} u' \\ v' \end{bmatrix}$$

and

$$\Delta = \det \begin{bmatrix} \alpha & \gamma \\ \beta & \delta \end{bmatrix}.$$

Remark: If w = 0 then I is called an absolute invariant, otherwise I is called a relative invariant.


The  key  element  in  relating  moments  to  the  algebraic  theory  of  invariants  is  the  moment 
generating  function.  As  shown  below,  the  moment  generating  function  can  be  expressed  as  an 
infinite  sum  of  binary  algebraic  forms.  Once  this  relation  is  expressed,  the  invariants  are 
deduced  from  the  theory  of  algebraic  invariants. 

With Definition 4 we can express the moment generating function as an infinite sum of
binary algebraic forms. Recall the moment generating function is given by

$$M(u,v) = \iint e^{(ux+vy)} \rho(x,y)\,dx\,dy$$

where $\rho$ is a piecewise continuous function having finite support. Expanding $e^{(ux+vy)}$ in a power
series, we obtain

$$M(u,v) = \iint \sum_{p=0}^{\infty} \frac{(ux+vy)^p}{p!}\, \rho(x,y)\,dx\,dy$$

and interchanging the order of summation and integration, we get

$$M(u,v) = \sum_{p=0}^{\infty} \frac{1}{p!} \iint (ux+vy)^p \rho(x,y)\,dx\,dy$$

$$M(u,v) = \sum_{p=0}^{\infty} \frac{1}{p!} \iint \sum_{k=0}^{p} \binom{p}{k} u^{p-k} v^k x^{p-k} y^k \rho(x,y)\,dx\,dy$$

$$M(u,v) = \sum_{p=0}^{\infty} \frac{1}{p!} \sum_{k=0}^{p} \binom{p}{k} \mu_{p-k,k}\, u^{p-k} v^k$$

and noting that the inner summation term is $(\mu_{p,0};\ \mu_{p-1,1};\ \ldots;\ \mu_{0,p})\,(u,v)$ we obtain

$$M(u,v) = \sum_{p=0}^{\infty} \frac{1}{p!}\, (\mu_{p,0};\ \mu_{p-1,1};\ \ldots;\ \mu_{0,p})\,(u,v) \tag{C.3}$$


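When applied to moments [Hu62], the theory of invariants tells us how to extract features
from an image such that the features remain unchanged if the image undergoes linear transformations.
The following is a derivation of the invariants under the following transformations:

$$\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \alpha & \beta \\ \gamma & \delta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \tag{C.4}$$

$$\begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} \alpha & \gamma \\ \beta & \delta \end{bmatrix} \begin{bmatrix} u' \\ v' \end{bmatrix}, \qquad ux + vy = u'x' + v'y'. \tag{C.5}$$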

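Let us consider the change in the moment generating function under (C.4) and (C.5). Recall equation (C.3),

$$M(u,v) = \sum_{p=0}^{\infty} \frac{1}{p!}\,(\mu_{p,0};\ \ldots;\ \mu_{0,p})(u,v) \tag{C.6}$$

and the definition of the moment generating function,

$$M(u,v) = \iint e^{ux+vy}\,\rho(x,y)\,dx\,dy. \tag{C.7}$$

Applying the transformations (C.4) and (C.5) to (C.7), we obtain

$$M(u,v) = \iint \sum_{p=0}^{\infty} \frac{1}{p!}\,(u'x'+v'y')^p\,\rho'(x',y')\,\frac{1}{|J|}\,dx'\,dy'$$

with

$$|J| = \det\begin{pmatrix} \delta & -\beta \\ -\gamma & \alpha \end{pmatrix}, \qquad \rho'(x',y') = \rho(x,y).$$

Similarly, for equation (C.6) we obtain

$$M'(u',v') = \sum_{p=0}^{\infty} \frac{1}{p!}\,(\mu'_{p,0};\ \ldots;\ \mu'_{0,p})(u',v')$$

provided that the transformed moments are defined by

$$\mu'_{pq} = \iint x'^{\,p}\,y'^{\,q}\,\rho'(x',y')\,dx'\,dy', \qquad p,q = 0,1,2,\ldots \tag{C.8}$$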
From the theory of algebraic invariants, the transformation law for the binary algebraic form

$$(\mu_{p,0};\ \mu_{p-1,1};\ \ldots;\ \mu_{0,p})(u,v)$$

is the same as that for

$$(ux + vy)^p = (x^p;\ x^{p-1}y;\ \ldots;\ y^p)(u,v).$$

Thus, if we let the coefficients $a_{p,0}, \ldots, a_{0,p}$ be the moments in (C.3) and combine this with (C.6) and (C.8), we have the following theorem [Hu62].

Theorem 6: If the algebraic form of order $p$ has an algebraic invariant of weight $w$, then the moments of order $p$ have the same algebraic invariant with the additional factor $|J|$, that is,

$$I(\mu'_{p,0}, \ldots, \mu'_{0,p}) = |J|\,\Delta^{w}\,I(\mu_{p,0}, \ldots, \mu_{0,p}).$$


C.2.1.  Invariants  under  scale  change 

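Consider the following similitude transformation:

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \alpha & 0 \\ 0 & \alpha \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}, \qquad \alpha \in \mathbb{R}.$$

Each coefficient of any algebraic form is an algebraic invariant of weight $p+q$, that is,

$$a'_{pq} = \alpha^{p+q}\,a_{pq}.$$

By applying Theorem 3 we obtain

$$\mu'_{pq} = |J|\,\alpha^{p+q}\,\mu_{pq}.$$

Under the similitude transformation we have $|J| = \alpha^2$, which yields

$$\mu'_{pq} = \alpha^{2}\,\alpha^{p+q}\,\mu_{pq}.$$

This is combined with the zeroth-order moment relation

$$\alpha = \sqrt{\frac{\mu'_{00}}{\mu_{00}}}$$

to yield

$$\mu'_{pq} = \left(\frac{\mu'_{00}}{\mu_{00}}\right)^{\frac{p+q}{2}} \left(\frac{\mu'_{00}}{\mu_{00}}\right) \mu_{pq},$$

that is,

$$\frac{\mu'_{pq}}{{\mu'_{00}}^{\frac{p+q}{2}+1}} = \frac{\mu_{pq}}{\mu_{00}^{\frac{p+q}{2}+1}}. \tag{C.9}$$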
Equation (C.9) defines absolute moment invariants under the similitude transformation.


Remark: Since $\mu_{10} = \mu_{01} = 0$, equation (C.9) is nontrivial for $p+q = 2, 3, \ldots$.
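As a worked check of the scale normalization, the sketch below uses the closed-form central moments of a uniform rectangle (an illustrative shape chosen here, not one used in the report) and confirms that the normalized moment of (C.9) is unchanged when both sides are scaled by the same factor. The function names are hypothetical.

```python
def rect_central_moment(a, b, p, q):
    """Central moment mu_pq of a uniform (density 1) a-by-b rectangle about its centroid."""
    def one_dim(length, n):
        # Integral of t**n over [-length/2, length/2]; odd powers vanish by symmetry.
        if n % 2 == 1:
            return 0.0
        return 2.0 * (length / 2.0) ** (n + 1) / (n + 1)
    return one_dim(a, p) * one_dim(b, q)

def normalized_moment(a, b, p, q):
    """mu_pq divided by mu_00 raised to (p + q) / 2 + 1."""
    mu00 = rect_central_moment(a, b, 0, 0)
    return rect_central_moment(a, b, p, q) / mu00 ** ((p + q) / 2.0 + 1.0)

scale = 3.0
eta_small = normalized_moment(2.0, 1.0, 2, 0)
eta_large = normalized_moment(scale * 2.0, scale * 1.0, 2, 0)
print(eta_small, eta_large)
```

For this rectangle $\mu_{20} = a^3 b / 12$ and $\mu_{00} = ab$, so the normalized value is $a/(12b)$, which depends only on the aspect ratio.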


C.2.2.  Moment  Invariants  under  Orthogonal  Transformations 

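Now we consider invariants under orthogonal transformations. Rotation of the coordinate system is one such transformation. Define the proper orthogonal transformation

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}, \qquad J = 1;$$

then the moment invariants are [Hu62]:

$$
\begin{aligned}
I_{p,0} &= \mu_{p0} - i\binom{p}{1}\mu_{p-1,1} - \binom{p}{2}\mu_{p-2,2} + i\binom{p}{3}\mu_{p-3,3} + \cdots + (-i)^{p}\mu_{0p} \\[4pt]
I_{p-1,1} &= (\mu_{p0} + \mu_{p-2,2}) - i(p-2)(\mu_{p-1,1} + \mu_{p-3,3}) + \cdots + (-i)^{p-2}(\mu_{2,p-2} + \mu_{0p}) \\[4pt]
I_{p-2,2} &= (\mu_{p0} + 2\mu_{p-2,2} + \mu_{p-4,4}) - i(p-4)(\mu_{p-1,1} + 2\mu_{p-3,3} + \mu_{p-5,5}) + \cdots \\
          &\qquad + (-i)^{p-4}(\mu_{4,p-4} + 2\mu_{2,p-2} + \mu_{0p}) \\[4pt]
I_{p-r,r} &= \big[(\mu_{p0};\ \mu_{p-2,2};\ \ldots;\ \mu_{p-2r,2r})(1,1);\ (\mu_{p-1,1};\ \mu_{p-3,3};\ \ldots;\ \mu_{p-2r-1,2r+1})(1,1);\ \ldots; \\
          &\qquad (\mu_{2r,p-2r};\ \mu_{2r+2,p-2r-2};\ \ldots;\ \mu_{0p})(1,1)\big](1,-i), \qquad p - 2r > 0 \\[4pt]
I_{\frac{p}{2},\frac{p}{2}} &= \mu_{p0} + \binom{p/2}{1}\mu_{p-2,2} + \binom{p/2}{2}\mu_{p-4,4} + \cdots + \mu_{0p} \qquad \text{for even } p \\[4pt]
I_{r,p-r} &= I^{*}_{p-r,r}
\end{aligned}
$$

where $*$ is complex conjugation.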

The invariants obtained from the second-order moments are given by $I_{11}$ and $I_{02}I_{20}$.

From the third-order moments, we get $I_{30}I_{03}$, $I_{21}I_{12}$, $(I_{30}I_{12}^{3} + I_{03}I_{21}^{3})$, and $\frac{1}{i}(I_{30}I_{12}^{3} - I_{03}I_{21}^{3})$, a skew invariant. Another invariant may be derived from a combination of second- and third-order invariants: $(I_{20}I_{12}^{2} + I_{02}I_{21}^{2})$.
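The rotation invariance of these combinations can be checked numerically. The sketch below assumes the compact form $I_{a,b} = \sum w\,(x - iy)^{a}(x + iy)^{b}$ over a centred, weighted point set; this form is not stated in the report, but for orders 2 and 3 it expands to the same expressions (for example $I_{20} = \mu_{20} - 2i\mu_{11} - \mu_{02}$). The point data and function names are made up for illustration.

```python
import numpy as np

def complex_invariant(points, weights, a, b):
    """I_{a,b} = sum of w * (x - iy)^a * (x + iy)^b over a centred point set."""
    z = points[:, 0] + 1j * points[:, 1]
    return complex(np.sum(weights * np.conj(z) ** a * z ** b))

def centre(points, weights):
    """Shift the points so that the weighted centroid is at the origin."""
    centroid = (weights[:, None] * points).sum(axis=0) / weights.sum()
    return points - centroid

def rotate(points, theta):
    """Apply x' = x cos(theta) + y sin(theta), y' = -x sin(theta) + y cos(theta)."""
    c, s = np.cos(theta), np.sin(theta)
    rotation = np.array([[c, s], [-s, c]])
    return points @ rotation.T

def low_order_invariants(points, weights):
    """The four third-order invariants plus the mixed second/third-order one."""
    pts = centre(points, weights)

    def I(a, b):
        return complex_invariant(pts, weights, a, b)

    return np.array([
        (I(3, 0) * I(0, 3)).real,
        (I(2, 1) * I(1, 2)).real,
        (I(3, 0) * I(1, 2) ** 3 + I(0, 3) * I(2, 1) ** 3).real,
        ((I(3, 0) * I(1, 2) ** 3 - I(0, 3) * I(2, 1) ** 3) / 1j).real,  # skew invariant
        (I(2, 0) * I(1, 2) ** 2 + I(0, 2) * I(2, 1) ** 2).real,
    ])

# Made-up weighted points (an assumption for illustration only).
points = np.array([[0.0, 0.0], [2.0, 0.5], [1.0, 3.0], [-1.5, 1.0]])
weights = np.array([1.0, 2.0, 0.5, 1.5])

before = low_order_invariants(points, weights)
after = low_order_invariants(rotate(points, 0.7), weights)
print(before)
print(after)
```

Under the rotation each $I_{a,b}$ is multiplied by $e^{i(a-b)\theta}$, so a product is invariant exactly when the values of $(a-b)$ over its factors sum to zero, which is the case for every combination listed.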




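In general, for $p \ge 4$ there are $\lceil p/2 \rceil$ invariants given by

$$I_{p0}I_{0p},\quad I_{p-1,1}I_{1,p-1},\quad \ldots,\quad I_{p-r,r}I_{r,p-r},\quad \ldots$$

and also, when $p$ is even, $I_{\frac{p}{2},\frac{p}{2}}$, where $\lceil x \rceil$ is the smallest integer greater than or equal to $x$. Combined with the moments of order $(p-2)$, we also have $\lceil \frac{p}{2} - 1 \rceil$ invariants,

$$
\begin{aligned}
&(I_{p-1,1}I_{0,p-2} + I_{1,p-1}I_{p-2,0}), \\
&(I_{p-2,2}I_{1,p-3} + I_{2,p-2}I_{p-3,1}), \\
&\qquad\vdots \\
&(I_{p-r,r}I_{r-1,p-r-1} + I_{r,p-r}I_{p-r-1,r-1}), \qquad p - 2r > 0,
\end{aligned}
$$

and, combined with the second-order moments,

$$
\begin{aligned}
&I^{2}_{\lfloor p/2 \rfloor,\,\lfloor p/2 \rfloor + 1}\,I_{20} + I^{2}_{\lfloor p/2 \rfloor + 1,\,\lfloor p/2 \rfloor}\,I_{02} && \text{if } p \text{ is odd,} \\
&I_{\frac{p}{2}-1,\,\frac{p}{2}+1}\,I_{20} + I_{\frac{p}{2}+1,\,\frac{p}{2}-1}\,I_{02} && \text{if } p \text{ is even,}
\end{aligned}
$$

where $\lfloor x \rfloor$ is the largest integer less than or equal to $x$. This gives a total of $(p+1)$ independent invariants.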
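The bookkeeping in this enumeration can be sketched in code. The following assumes the rotation rule $I'_{a,b} = e^{i(a-b)\theta} I_{a,b}$ (which follows from the proper orthogonal transformation of this section) and represents each invariant by the index pairs of its factors; it checks only that the phases cancel and that the count is $p+1$, not that the invariants are independent. The representation and function names are hypothetical.

```python
def rotation_phase(term):
    """Multiple of theta picked up by a product of I_{a,b}^power factors."""
    return sum((a - b) * power for a, b, power in term)

def invariants_of_order(p):
    """List the p + 1 invariants of order p >= 4 as lists of (a, b, power) factors.

    Two-term sums such as I_{p-1,1} I_{0,p-2} + I_{1,p-1} I_{p-2,0} are represented
    by their first term only; the second term is its complex conjugate.
    """
    terms = []
    r = 0
    while p - 2 * r > 0:                      # I_{p-r,r} I_{r,p-r}
        terms.append([(p - r, r, 1), (r, p - r, 1)])
        r += 1
    if p % 2 == 0:                            # I_{p/2,p/2}
        terms.append([(p // 2, p // 2, 1)])
    r = 1
    while p - 2 * r > 0:                      # combined with order p - 2
        terms.append([(p - r, r, 1), (r - 1, p - r - 1, 1)])
        r += 1
    if p % 2 == 1:                            # combined with second order, p odd
        terms.append([((p - 1) // 2, (p + 1) // 2, 2), (2, 0, 1)])
    else:                                     # combined with second order, p even
        terms.append([(p // 2 - 1, p // 2 + 1, 1), (2, 0, 1)])
    return terms

counts = {p: len(invariants_of_order(p)) for p in (4, 5, 6, 7)}
print(counts)
```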

C.2.3.  Summary  of  moment  invariants 

The central moments $\{\mu_{pq}\}$ are invariant to translation of the coordinate system. The invariants under similitude transformation (scale) are defined by equation (C.9). In Section C.2.2 we listed the invariants under orthogonal transformation. These can be combined to produce scale, rotation and translation invariants as follows.

Let $\phi_1, \ldots, \phi_7$ be the invariance functions for second- and third-order moments (obtained by evaluating the second- and third-order invariants from Section C.2.2); then

$$\phi_1 = \mu_{20} + \mu_{02} \tag{C.10}$$

$$\phi_2 = (\mu_{20} - \mu_{02})^2 + 4\mu_{11}^2 \tag{C.11}$$

$$\phi_3 = (\mu_{30} - 3\mu_{12})^2 + (3\mu_{21} - \mu_{03})^2 \tag{C.12}$$

$$\phi_4 = (\mu_{30} + \mu_{12})^2 + (\mu_{21} + \mu_{03})^2 \tag{C.13}$$

$$
\begin{aligned}
\phi_5 &= (\mu_{30} - 3\mu_{12})(\mu_{30} + \mu_{12})\big[(\mu_{30} + \mu_{12})^2 - 3(\mu_{21} + \mu_{03})^2\big] \\
       &\quad + (3\mu_{21} - \mu_{03})(\mu_{21} + \mu_{03})\big[3(\mu_{30} + \mu_{12})^2 - (\mu_{21} + \mu_{03})^2\big]
\end{aligned} \tag{C.14}
$$

$$
\begin{aligned}
\phi_6 &= (\mu_{20} - \mu_{02})\big[(\mu_{30} + \mu_{12})^2 - (\mu_{21} + \mu_{03})^2\big] \\
       &\quad + 4\mu_{11}(\mu_{30} + \mu_{12})(\mu_{21} + \mu_{03})
\end{aligned} \tag{C.15}
$$

$$
\begin{aligned}
\phi_7 &= (3\mu_{21} - \mu_{03})(\mu_{30} + \mu_{12})\big[(\mu_{30} + \mu_{12})^2 - 3(\mu_{21} + \mu_{03})^2\big] \\
       &\quad - (\mu_{30} - 3\mu_{12})(\mu_{21} + \mu_{03})\big[3(\mu_{30} + \mu_{12})^2 - (\mu_{21} + \mu_{03})^2\big]
\end{aligned} \tag{C.16}
$$


Equations (C.10) through (C.16) are the invariants under translation and rotation. To make them invariant to scale change as well, normalize according to (C.9); that is, define $\eta_{pq}$ by

$$\eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\frac{p+q}{2}+1}} \tag{C.17}$$

and replace $\mu_{pq}$ by $\eta_{pq}$ in (C.10) through (C.16).
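The seven normalized invariants can be sketched for a grey-level image as follows. This is an illustrative NumPy implementation under the assumption that pixel column and row indices serve as the $x$ and $y$ coordinates; it is not the code used in the report, and the test image is made up. Translation (zero padding) and a 90-degree rotation map the pixel grid onto itself, so for these two cases the invariance holds up to floating-point error.

```python
import numpy as np

def central_moment(image, p, q):
    """Central moment mu_pq with x = column index and y = row index."""
    rows, cols = np.indices(image.shape)
    m00 = image.sum()
    x_bar = (cols * image).sum() / m00
    y_bar = (rows * image).sum() / m00
    return float((((cols - x_bar) ** p) * ((rows - y_bar) ** q) * image).sum())

def hu_invariants(image):
    """phi_1..phi_7 of (C.10)-(C.16), using the normalized moments of (C.17)."""
    image = np.asarray(image, dtype=float)
    mu00 = central_moment(image, 0, 0)

    def eta(p, q):
        return central_moment(image, p, q) / mu00 ** ((p + q) / 2.0 + 1.0)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    a, b = n30 + n12, n21 + n03            # recurring sums
    c, d = n30 - 3 * n12, 3 * n21 - n03    # recurring differences
    return np.array([
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        c ** 2 + d ** 2,
        a ** 2 + b ** 2,
        c * a * (a ** 2 - 3 * b ** 2) + d * b * (3 * a ** 2 - b ** 2),
        (n20 - n02) * (a ** 2 - b ** 2) + 4 * n11 * a * b,
        d * a * (a ** 2 - 3 * b ** 2) - c * b * (3 * a ** 2 - b ** 2),
    ])

# Made-up asymmetric test image (an assumption for illustration only).
image = np.array([[0, 1, 2, 0],
                  [0, 3, 1, 0],
                  [1, 4, 0, 0]], dtype=float)
shifted = np.pad(image, ((2, 1), (3, 0)))   # translation by zero padding
rotated = np.rot90(image)                   # proper rotation by 90 degrees

phi = hu_invariants(image)
phi_shifted = hu_invariants(shifted)
phi_rotated = hu_invariants(rotated)
print(phi)
```

For arbitrary rotation angles or scale factors a digital image must be resampled, and the invariance then holds only approximately.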


Note: The normalization given by (C.17) disagrees with the normalization given in Table 9 of [MM84]. We also note a typographical error in the normalization equation (30) of [Hu62].


C.3.  Maitra  invariants 

Moments that are invariant under scale, rotation and translation may be sufficient to characterize the image. However, if invariance under illumination changes is also desired, the new invariants must incorporate the illumination conditions.

Maitra [Ma79] has considered this problem and obtained illumination invariants, called Maitra invariants. These invariants incorporate scale, rotation, translation and illumination changes and are based on moment invariants.


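Let $g_1(x,y)$ and $g_2(x,y)$ be two grey-value images related by the following transformations:

$$g_1(x,y) = k\,g_2(x',y'), \qquad k \ne 0 \tag{C.18}$$

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = \alpha \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} a \\ b \end{pmatrix} \tag{C.19}$$

where $\theta$ is the angle of rotation, $(a,b)$ is the translation, $\alpha$ is the scale factor and $k$ is the change in illumination.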


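To obtain the Maitra invariants, compute the moment invariants for $g_1(x,y)$ and $g_2(x',y')$ to obtain $\phi_1, \ldots, \phi_7$ and $\phi'_1, \ldots, \phi'_7$, respectively. The relations among the invariants are

$$\phi_1 = \frac{k}{\alpha^{4}}\,\phi'_1 \tag{C.20}$$

$$\phi_2 = \frac{k^{2}}{\alpha^{8}}\,\phi'_2 \tag{C.21}$$

$$\phi_3 = \frac{k^{2}}{\alpha^{10}}\,\phi'_3 \tag{C.22}$$

$$\phi_4 = \frac{k^{2}}{\alpha^{10}}\,\phi'_4 \tag{C.23}$$

$$\phi_5 = \frac{k^{4}}{\alpha^{20}}\,\phi'_5 \tag{C.24}$$

$$\phi_6 = \frac{k^{3}}{\alpha^{14}}\,\phi'_6 \tag{C.25}$$

$$\phi_7 = \frac{k^{4}}{\alpha^{20}}\,\phi'_7 \tag{C.26}$$

with the zeroth-order moments $\mu_{00}$ and $\mu'_{00}$ related by

$$\mu_{00} = \frac{k}{\alpha^{2}}\,\mu'_{00}.$$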

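By eliminating the constants $k$ and $\alpha$ from (C.20) through (C.26), the following quantities, called Maitra invariants, become invariant under (C.18) and (C.19):

$$\beta_1 = \frac{\sqrt{\phi_2}}{\phi_1} \tag{C.27}$$

$$\beta_2 = \frac{\phi_3\,\mu_{00}}{\phi_2\,\phi_1} \tag{C.28}$$

$$\beta_3 = \frac{\phi_4}{\phi_3} \tag{C.29}$$

$$\beta_4 = \frac{\sqrt{\phi_5}}{\phi_4} \tag{C.30}$$

$$\beta_5 = \frac{\phi_6}{\phi_4\,\phi_1} \tag{C.31}$$

$$\beta_6 = \frac{\phi_7}{\phi_5} \tag{C.32}$$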
Equations (C.27) through (C.32) are invariant under (C.18) and (C.19). However, $\beta_4$ is undefined when $\phi_5$ is negative. To eliminate the ambiguity caused by the square root, we will use


in addition to $\beta_1$, $\beta_2$, $\beta_3$, $\beta_5$, $\beta_6$, as defined in (C.27)-(C.29), (C.31) and (C.32).

We note another disagreement in $\beta_2$ of Table 10 in [MM84].

C.4.  Conclusions 

Maitra's invariants are invariant under the transformations given in equations (C.18) and (C.19). The moment invariants are therefore redundant as features, but they are still needed to compute the Maitra invariants.

The  moment  invariants  are  truly  invariant  only  in  the  continuous  domain.  For  the  case  of 
digital  imagery,  some  of  the  invariance  is  lost  due  to  undersampling  and  quantization  errors. 
Both  these  errors  decrease  as  the  image  size  increases  [TeCh86].  Maitra’s  invariants  tend  to  be 
less  sensitive  to  the  size  of  an  image.  It  is  of  course  possible  to  investigate  the  various  effects 
that  influence  the  invariance  property  of  different  types  of  moments. 
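The loss of invariance in the digital domain can be illustrated with a small experiment that is not part of the report: a disk sampled on the integer grid, for which the continuous value of the scale-normalized $\phi_1 = \eta_{20} + \eta_{02}$ is $1/(2\pi)$ regardless of radius. The sketch below is an assumption-laden toy (binary disk, pixel centres as point masses, hypothetical function name) and only compares a very small disk with a larger one.

```python
import math

def discrete_disk_phi1(radius):
    """Scale-normalized phi_1 = eta_20 + eta_02 of a disk sampled on the integer grid."""
    count = 0                 # mu_00: number of pixels inside the disk
    second_moment = 0.0       # mu_20 + mu_02 about the centre (the centroid by symmetry)
    for x in range(-radius, radius + 1):
        for y in range(-radius, radius + 1):
            if x * x + y * y <= radius * radius:
                count += 1
                second_moment += x * x + y * y
    return second_moment / count ** 2

continuous_phi1 = 1.0 / (2.0 * math.pi)
errors = {r: abs(discrete_disk_phi1(r) - continuous_phi1) for r in (2, 8, 64)}
print(errors)
```

At radius 2 the disk has 13 pixels and the sampled value is $28/169$, noticeably different from $1/(2\pi)$; at radius 64 the deviation is much smaller.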


APPENDIX  D:  A  PARTIAL  LIST  OF  TWIN  ROUTINES 




A partial list of TWIN routines for creating and manipulating TWIN objects:

Function Name    Description

Cone             Create a boundary representation of a cone approximation.
Cylinder         Create a boundary representation of a cylinder approximation.
EllCone          Create a boundary representation of an elliptical cone approximation.
Ellipse          Create a boundary representation of an ellipsoid approximation.
Fillet           Create a boundary representation of a fillet shape.
Ppiped           Create a boundary representation of a parallelepiped (box).
SolidTorus       Create a boundary representation of a solid torus approximation
                 (torus with no hole).
Sphere           Create a boundary representation of a sphere approximation.
Torus            Create a boundary representation of a torus approximation.
Wedge            Create a boundary representation of a wedge.
Combine          Perform a boolean operation on two TWIN objects. The operation can
                 be either union, subtraction, or intersection.
ReadObj          Read a TWIN object from a file and return a pointer to it.
WriteObj         Write a TWIN object to a file.
render           Create a rendering of a TWIN object.

show_evidence(Derived from Answer, H)

write(Derived), write(' from'), % Show rule

(bind <first-surface> (litval surfaces))

(bind <second-surface> (compute <first-surface> + 1))

(modify <object> ^surfaces (substr <object> <second-surface> inf) nil)


